Shortcut to seniority

Concepts and trending technologies

Robotic Process Automation (RPA)

Robotic refers to the entities which mimic human actions (Robots).

A process is a sequence of steps to be performed to complete an action.

Automation is any process done by a robot without human interaction.

RPA is the process of automating operations with the help of robots.

The robots are capable of mimicking most human actions, such as logging into applications, copy and pasting data, filling forms, extract data from documents, accessing databases (read or write), and many others. Basically, they release the humans from the tasks that they prefer to never do again.

In the traditional automation tools, a developer will produce a list of actions that are required to automate a task, by using APIs of the system and scripting languages.

RPA is developing the action list by watching how the user perform a task in a GUI, and then repeats those operations directly on the GUI.

The difference is that the RPA is most likely designed to be used by non-technical people, in comparison to automation tools which require development knowledge.

For example, people could train the robots by doing drag and drop or click some buttons.

Robots could also work with unstructured data input, by recognizing the text in a document (OCR) and then working with that text.

Other than that, RPA is supposed to cover an entire toolchain of possibilities, while test automation focus on one specific thing (for example web automation tools allow you to control the browser, but if you need to open a document, or interact with the file system, you’re in trouble).

Impact of employment

There’s also the impact of employment to be considered here (“Technology is here to steal our jobs”).

Many people are afraid of losing their jobs to the robots, while companies promise to keep the workers, but provide them with other, more complex work.

While this can also impact the required headcount for a company, the technology can also be used to achieve more work and better productivity with the same number of people.

Big data

Big data is a term that refers to large volume of data sets. Challenges when using big data include capturing, storing, analyzing, searching, querying, visualizing, updating them etc.

After analyzing the data, we can reduce costs and time, develop new products, or making smart decisions regarding the direction of the company.

The big data sets can be mined for insights, and are characterized by:

A huge volume of data
A broad variety of data types
The velocity at which the data needs to be processed and analyzed

The data that form the sets can come from multiple sources, such as web sites, social media, mobile apps, etc.

Big data analytics

It doesn’t matter how big your volume of data is, but what you do with it.

Big data analytics allows the companies to gain huge insights about their customers and predict what is important in the future, for the business. Analysis can be performed to identify patterns in the data, or to apply statistical techniques in order to confirm an assumption or invalidate it.

Through these analytics, the companies can improve their customer service, increase their sales or improve their efficiency. They can also be used defensively, by identifying patterns and suspicious activity that might raise some concerns, and help mitigate the risks associated with such behavior.

Infrastructure

Big data cannot work unless there’s an infrastructure set in place to gather, store, process, and secure the data.

The security infrastructure tools could include data encryption, user authentication, monitoring systems, and other products to protect the systems and the data.

Big data technologies

Apache Hadoop is one of the technologies that is strongly associated with big data. Hadoop is an open source framework that allows you to process large data sets across multiple computers, scaling from one server to thousands.

Apache Spark is an open source cluster-computing framework, which serves as an engine for processing big data within Hadoop.

NoSQL databases store and manage data in a way that allows speed and flexibility, and unlike SQL databases, they can be scaled horizontally across thousands of servers.

Internet of things (IOT)

The internet of things (IoT) is an ecosystem of interconnected devices, and it evolved from machine to machine communication. These devices uses embedded processors, sensors and communication hardware to collect, send, and act on data they acquire from their environments.

In other words, IoT is a network of connected devices, used to gather and share data.

IoT is the heart of smart home devices, such as those that automatically adjust heating or lightning.

We can say that IoT has three main parts: the Things, the networks, and the systems.

Things

‘Things’ refer to any IoT device that uses some embedded sensors to collect data. Such sensors detect events or changes in the environment, and send the data to the internet or towards other connected devices, which can act on that data. For example, a device could be a lightbulb, which can be turned on and off using a smartphone application.

Systems

Systems refers to the servers that receives the data from the IoT devices, which collects and processes the data on the fly, and even trigger specific actions.

Security is a very important topic, and right now it’s one of the biggest issues that we have with the IoT. The sensors that we have collect sensitive data, such as what you do, what you say, where you are, etc.

If hackers compromise such systems, they can easily track your location or eavesdrop on your conversations.

Privacy is also important, because there’s so much data that can be gathered through smart devices.

While these smart devices are there to ease your life, there’s still the incertitude of the companies that build such smart devices, mainly because they can sell your data to other companies.

Trade-offs

On the bright side, if we have an alarm in the house (proximity sensor alarmed) and the system detects movement (e.g. burglar), it can send such data over internet to the customer (phone sms for example, or activating a nearby video camera to live stream the situation), lock the automatic door and call the police, providing GPS location and other important information.

On the other side, if your smart lock at your entrance door is vulnerable and easily hackable, the attacker can simply walk into your house.

Cloud

Cloud computing

Cloud, or Cloud computing, is a type of outsourcing of computer services. The users can simply use storage and computing power, without worrying how they work internally. With cloud computing, technology-enabled services from the internet (“cloud”) are provided “As a service”, allowing users to access it without having knowledge or control over the technologies behind the servers.

Cloud storage

Cloud storage services may be accessed through a cloud computing service, a web service API, or by applications that uses the API, such as cloud desktop storage.

Cloud storage providers are responsible for keeping the data available and accessible, and the physical environment protected and running, whereas people buy or lease storage capacity from the providers.

Why use cloud?

Using cloud services is very helpful for both small and big companies.

Imagine that you no longer need a team of experts to install, configure, test, run, secure, and update all the hardware and software within your system.

You no longer have to worry about your own data, because that’s now the responsibility of a 3^rd party vendor.

Upgrades are automatic, scaling up and down is easy, and you only pay for what you need and use.

And if you want to move your business from one country to another – you don’t need to carry all that hardware with you – it’s in the cloud !

Everything As A Service

There are a few options available ‘as a service’, such as:

Infrastructure as a Service (IaaS)
Platform as a Service (PaaS)
Software as a Service (SaaS)

Infrastructure as a Service means that you pay in order to have access to storage, networking, and computing resources. You can choose to have your own storage and your own servers, or to rent them from a cloud service provider. This service is the underlying infrastructure for PaaS or SaaS. The provider is responsible for the management and maintenance of the hardware, for the security, network monitoring, recovery after a failure, and load balancing. With IaaS, you can focus on your business, and not on the infrastructure itself.

IaaS should be used for companies that don’t want to commit to hardware / infrastructure investments.

Platform as a Service means that you pay in order to have a cloud-based platform that is used to build, to test and to deliver applications. PaaS provides a scalable framework for the developers to host and configure their key services.

The provider is responsible for running and managing the development tools and databases.

PaaS should be used in projects where multiple developers work on the same application, and in projects that require a high level of customization.

Software as a service means that you pay in order to have access to a third-party software, based on subscriptions.

This means that you don’t need to install or run applications, but offer them as a web service instead. Basically, you pay for having access to the business’s software, while the provider runs and manages the software on their servers.

Unlike IaaS and PaaS, SaaS does not require any technical skill or expertise.

SaaS should be used for short term projects with remote teams and for easy access on both web and mobile.

Augmented Reality and Virtual Reality

Augmented Reality

AR keeps the real world central but enhances it with other digital details, adding an additional layer in which digital content can be displayed.

The easiest way to experience AR is through an AR app in your smartphone.

The application will display the world through the phone camera, while 3D graphical objects will be displayed over it, tricking your brain into thinking that the digital object is actually there - as long as you watch the world through the phone.

A more interesting experience is through AR headsets, which is similar to wearing glasses, through which digital content is displayed.

Imagine seeing the world through glasses that enhance the ability to see - such as doing real-time facial recognition on people around you and showing their name and age over their heads.

Or imagine driving the car and seeing some relevant things drawn through the windshield, such as rectangles over the pedestrians, information about the weather, the road, the speed you’re driving, a mini-map with the path towards destination, or arrows to show you which way you should go.

Virtual Reality

VR is the most known of these technology, and provides a fully immersive experience, which tricks your senses into thinking that you are in a different environment, by using a headset that fully covers your vision.

Through the headset, the user can experience a computer generated world in which they can manipulate objects or move around using controllers.

Whereas the video games are displayed on the screen and you look into it, while being stuck on the chair, and still seeing everything happening around you, the headset, headphones, and controllers will provide a much better experience.

Because the headset is attached to your head, the camera will move at the same time with your head, something that was not available until now, in video games.

Blockchain

Blockchain is a time-stamped series of immutable data that is distributed (spread across multiple networks).
The block is the digital information, whereas the chain is the public database.
A block in a blockchain is a collection of data.
We add the data to the block by connecting it with other blocks, in chronological order, creating a chain of blocks that are linked together.
Blockchain is the definition of a democratized system: All the information is open for everyone to see, making the system transparent, and decisions are taken together as a group.

The blockchain is a linked list which contains some data and a hash pointer which points to its previous block, creating the chain.
A hash pointer is similar to a pointer, but contains not only the address of the previous block, but also the hash of the data inside of it.
If a hacker attempts to attack block 5 and change the data, any change in the data will change the hash.
Any change made in block 5 will change the hash which is stored in block 4, which will change the data and the hash of block 3, and so on, completely changing the chain.

Because this is impossible, this is how blockchains assure immutability of the data. Blockchain will solve any problems coming from missed transactions, human/machine errors, or exchanges that are done without the consent of the parties.

Key features

Peer-to-peer

There is no central authority that controls or manipulates the blockchain. All the participants talk to each other directly, allowing for the data exchange to be made directly between two parties without other interference.

Decentralization

The database is distributed across the entire network, which does not make way of data manipulation.

Until now, we knew only about centralized entities, which stored all the data (server) and with which we’d have to interact to get the information.

For example, the banks are a centralized system, because they store all your money, and the only way to pay someone is by going through the bank. The problem is that because the systems are centralized, there are a few vulnerabilities:

all the data is stored in one place.
when a software upgrade is performed / maintenance etc, the entire system is down.
if the entire system goes down for some reason, nobody will be able to access their information (eg. if a bank fully closes, the money you own will disappear).

Privacy / Transparency

Both of these go hand in hand.

All the transactions are transparent and everyone can see the transaction history of other people. However, the privacy is not gone, because when checking transactions, you’ll see their public address, not their identity, which is hidden by complex cryptography logic.

Immutability

Data can be added in the blockchain only in a time-sequential order. Once the data is added to the blockchain, it is almost impossible to change it.

Consensus

Probably the most important feature, the data within a blockchain can be updated only via consensus.

Since there’s no central authority in control of updating the data, any update that must be done to the blockchain is validated against criterias defined by the blockchain protocal, and added to it only after a concensus has been reached between all the participants on the network.

How does it work

In simple terms, blockchain technology is a way of passing some information from A to B in a secure matter. A transaction represents any action performed in a blockchain, and most commonly it’s a data structure that represents a transfer of value between users within the network.

The structure contains some relevant data such as the source, destination, validation information, the value to be transferred, etc.
The process starts when a node (user, participant) creates a transaction, digitally signing it with its private key.
The transaction is propagated to peers that validate the transaction (Gossip protocol).
This works as follows: One participant sends 1 EUR to another - The nearest nodes will find out about this too, telling this further to the nodes closest to them, until everyone finds out, similar to floodfill algorithm.

Once this is done, the transaction is considered confirmed, and it is included in a block and added into the chain, becoming part of the blockchain.

Afterwards, it is propagated across the network, because the chain is stored across the entire network.

The next block will link itself cryptographically back to this block, through a hash pointer.

Real life example

Let’s assume we want to buy some plane tickets, so we search online on what’s the cheapest price.

The prices are different because the companies are taking a cut for processing / mediating the transaction.

With blockchain, the airline company can move the entire process to the blockchain, replacing all the vendors that rely on charging fees for transactions.

The parties of the transaction will be the airline company and the clients, and each ticket is a block that is added to the ticket blockchain. Therefore, your ticket is a unique entry, easily verifiable and unfalsifiable.

Specific terms

Public / Private Key

The public key could be a locker and the private key could be the locker combination.

Everyone can insert letters through the opening in the locker, but only the person that knows the combination can open it. However,if you forget your private key, you’ve lost the access to the wallet, forever.

Wallet

A blockchain wallet is a digital wallet that allows the users to store cryptocurrency.

Cryptocurrency is a virtual currency that uses cryptography for security.

A wallet ID is a unique identifier (similar to a bank account number).

The wallet should show the current wallet balance and most recent transactions. Through the wallet, the users can send currency, or buy / sell / exchange cryptocurrencies.

The wallet cointains the user’s public key, while the private key is used to sign any transactions.

Mining

Mining refers to finalizing a block (adding transactions to the existing blockchains), by creating a hash of a block of transactions, protecting the integrity of the entire blockchain.

The miners verify the transactions by making sure that the cryptocurrency was not already spent in a previous blockchain of the system.

This is done by solving a complex computational math problem (proof of work) - fetching a 64-digit hexadecimal number which is less or equal than the target hash.

The only way to do this is by using bruteforce (trying all the possible combinations).

This proof of work is a piece of data which is difficult to produce but easy for others to verify.

This is intentionally expensive for the miners because it also makes it expensive for a malicious miner to attack the network.

In order to correctly solve a block, miners have to manipulate the hash of a block until that hash is below a value (target hash).

In order to manipulate the hash of a block, the miners include a random number within their block (the term is “a nonce” - a number used once).

In order to find the correct nonce, the miners choose a nonce, include it in the contents, hash the block and compare it with the target hash.

Once the block is solved, the block is broadcasted to the network, and the entire network checks that the block is indeed correct by rehashing that block and validating that the value is lower than the target hash.

When the number of miners increase, the blocks are discovered faster, and in order to keep the difficulty up and have 1 block being found every 10 minutes, the target hash can be adjusted so that the difficulty will increase for miners to solve a block.

As compensation for those efforts, the miners are awarded cryptocurrency whenever they add a new block of transactions to the blockchain.

Let’s make a simple analogy here:

I write on a piece of paper a number between 1 and 100, and i ask 3 people to guess any number that is less than or equal to that number.

There is no limit to how many guesses they have, and only the first one that answers correctly will get the prize.

With blockchain, I ask millions of miners to guess, and i think of a 64-digit hexadecimal number instead.

Mining pool

A mining pool is a group of miners who combine their computing power and split the mined currency between participants.

51% attack

A 51% attack refers to an attack on a blockchain, when a group of miners control more than 50% of the network computing power.

If that happens, they would be able to prevent new transactions, therefore interrupting any payment between users, and reverse transactions, allowing them to get back their coins.

The further the transactions are, the more difficult it would be to change them, and past a checkpoint, even impossible, as they could be hardcoded into the software.

Artificial intelligence (AI)

Artificial Intelligence allows the machines to learn from experience and adjust to new inputs.

With Artificial Intelligence, the computers can be trained to accomplish tasks by processing large amounts of data and recognizing patterns in those data.

Artificial Intelligence is a broad field, that includes many methods and technologies.

Neural networks

Neural networks are a set of algorithms that are modelled after the human brain and that are designed to recognize patterns in data. Neural networks map inputs to outputs.

Classification tasks depend on labeled datasets – this is known as supervised learning (we’ll talk about it later).

Clustering / grouping is the detection of similarities, and does not require labels to detect similiarities – this is known as unsupervised learning (we’ll also talk about it later).

Let’s talk about the simplest neural network, which is a single layer neural network, called Perceptron.

Perceptron is a linear binary classifier that can be used in supervised learning, helping classifying data, and it consists of 4 parts: The input values, the weights and bias, the net sum, and the activation function.

image with caption — Figure xxx - Artificial Intelligence - Neural networks - Perceptron

How does it work?

The following steps will be performed:

Multiply all the inputs with their weights.
Add all the multiplied values. This sum is called Weighted Sum.
Apply the Weighted sum to the Activation function.

An example of the activation function is the following:

Weights show the strength of a particular node, or in simpler terms, how much influence does the input node has on the output.

The bias allows you to shift the activation function curve up or down, to better fit the data.

Activation function is used to map the input between the required values such as (0, 1).

If it’s not clear yet, let’s give a real example.

Let’s say we want to implement an e-mail spam classifier.

We need to see what features are available in spam mails and not available in non-spam emails, such as spelling mistakes, or findings of the word ‘buy’.

If we do a manual classification, we could find that most of non-spam e-mails contain the word ‘buy’ less than two times and less than three spelling mistakes.

The real strength is when we combine these rules. Let’say we add the number of the word ‘buy’ with the number of spelling mistakes.

We can notice that the value is lower for non-spam e-mails, and higher for spam e-mails.

Graphically, we can plot them as follows:

The horizontal axis is used for the number of appearance of the word ‘buy’.
The vertical axis is used for the number of spelling mistakes.

What we need is for the computer to place the line correctly.

Step one is to start with a random position of the line and ask the points where the line should be.

While the red points are ok with the line (they are all under it), the blue points need the line to pass them, so the blue points will ask the line to move a bit up.

Step two is to pick a large number of repetitions (epochs), where an epoch represents one iteration over the entire dataset.

Step three is to pick a small number and set it as learning rate (let’s say 0.01).

Step four is the loop (repeat until we reach the number of epochs we want), in which we pick a random point, and we check its classification result:

If it’s correctly classified, we do nothing
If it’s incorrectly classified, we move the line towards the point

How do we move the line towards the points? We do this by either translation or rotation movement.

In our example, the equation was: 2x + 3y = 6 (or 2x + 3y + (-6) = 0).

Translation: If we change the result from 6 to a higher value (7), the line will move up, and if we change it to a lower value (5), the line will move down.

Rotation is when we rotate through a pivot, and in our case we have two pivots, the x and y axis.

The point of intersection for x=0 means that y=6/3 => 2. Changing the value of 3 will rotate the line towards the x pivot, because the fraction decreases/increases so the line rotates because the point of intersection of the line and the x axis is moved to the left/right.

The point of intersection for y=0 means that x=6/2 => 3. Changing the value of 2 will rotate the line towards the y pivot, based on the same reasoning.

The learning rate is the value we use for incrementing/decrementing when moving the line. In Machine Learning, we want to take small steps, so let’s set the rate to something as small as 0.1.

Real example

We start with a random line of equation ax + by + c = 0, we set the number of epochs (eg. 1000) and the learning rate (eg 0.01).

We then repeat 1000 times and do the following:

We pick a random point (p, q).
If the point is correctly classified we do nothing
Otherwise, we modifiy the a,b,c based on its location in relation with the line.
If the point is blue and is in the red area:
Substract 0.01p to a.
Substract 0.01q to b.
Substract 0.01 to c.
If the point is red and is in the blue area:
Add 0.01p to a.
Add 0.01p to b.
Add 0.01 to c.

Positive and negative regions

Every point above the line will give us a positive value and every point below the line will give us a negative value.

Therefore:

the red area (spam mail) means that ap + bq + c > 0.
The blue area (non-spam mail) means that ap+bq+c < 0.

The training item is a class that contains the inputs and the expected result. The line 11 is there to allow the Perceptron class (which we will define later) to access its private members.

The main function creates a few entries and adds them into a vector of TrainingItems.

We train the network to learn that we expect the result to be false for small values (0.2 – 0.4) and true for bigger values (0.5 – 0.7).

                                            
                                                class TrainingItem {    
                                                public:    
                                                TrainingItem(bool _expectedResult, const vector<double> &_inputs)    
                                                    : expectedResult(_expectedResult)    
                                                    , inputs(_inputs)    
                                                {    
                                                }    
                                                private:    
                                                bool expectedResult;    
                                                const vector<double> inputs;    
                                                friend class Perceptron;    
                                                };

We then create a Perceptron object and train the network, and then assert the result returned by the perceptron for values already given and also new items that match our criteria (less than a given value, bigger than a given value).

The perceptron receives the number of training items (in order to allocate enough items for the weights), the learning rate, and the threshold.

The perceptron will keep the weights and update them based on the activation result.

                                            
                                                int main() {    
                                                vector<TrainingItem> trainingSet =    
                                                {    
                                                    TrainingItem(false,{ 1, 0.2 }),    
                                                    TrainingItem(false,{ 1, 0.3 }),    
                                                    TrainingItem(false,{ 1, 0.4 }),    
                                                    TrainingItem(true, { 1, 0.5 }),    
                                                    TrainingItem(true, { 1, 0.6 }),    
                                                    TrainingItem(true, { 1, 0.7 })    
                                                };     
                                                    
                                                Perceptron perceptron(2);    
                                                perceptron.train(trainingSet, 10000);    
                                                    
                                                assert(perceptron.getActivationResult({ 1, 0.1 }) == false);    
                                                assert(perceptron.getActivationResult({ 1, 0.3 }) == false);    
                                                assert(perceptron.getActivationResult({ 1, 0.5 }) == true);    
                                                assert(perceptron.getActivationResult({ 1, 1.0 }) == true);    
                                                    
                                                return 0;    
                                                }

On the next page we have the full implementation of the perceptron. The train function (Line 13) iterates through the set multiple times, learns (updates the weights) for each item, and stops if no errors were detected at the end of the loop.

The activation result function returns 0 or 1 based on the dot product in relation with the threshold.

The learn function will change the weight (increase or decrease) for each item, based on the learning rate.

                                            
                                                class Perceptron {    
                                                public:    
                                                Perceptron(int inputCount, const double _learningRate, const double _threshold)    
                                                    : learningRate(_learningRate)    
                                                    , threshold(_threshold)    
                                                    , weights(inputCount)    
                                                {    
                                                }    
                                                const double learningRate;    
                                                const double threshold;    
                                                vector<double> weights;    
                                                    
                                                void train(vector<TrainingItem>& trainingSet, const unsigned int epochs) {    
                                                    unsigned int currentEpoch = 0;    
                                                    while (++currentEpoch < epochs) {    
                                                        int errors_detected = 0;    
                                                    
                                                        for (const TrainingItem& item : trainingSet) {    
                                                            bool actual_result = learn(item);    
                                                            if (actual_result != item.expectedResult) {    
                                                            errors_detected++;    
                                                            }    
                                                        }    
                                                    
                                                        if (errors_detected == 0) {    
                                                            cout << "Perceptron trained after " << currentEpoch << " epochs";    
                                                            break;    
                                                        }    
                                                    }    
                                                }    
                                                    
                                                bool getActivationResult(const vector<double>& inputs) {    
                                                    double dot_product = inner_product(    
                                                        inputs.begin(), inputs.end(), weights.begin(), 0);    
                                                    return dot_product > threshold;    
                                                }    
                                                    
                                                bool learn(const TrainingItem& item) {    
                                                    bool activation_result = getActivationResult(item.inputs);    
                                                    if (activation_result != item.expectedResult) {    
                                                        double error = (item.expectedResult ? 1 : 0) - (activation_result ? 1 : 0);    
                                                        for (int i = 0; i < weights.size(); i++) {    
                                                            weights[i] += learningRate * error * item.inputs[i];    
                                                        }    
                                                    }    
                                                    return activation_result;    
                                                }    
                                                };

Machine learning

Machine learning is the field that allows computers the capability to learn without being explicitly programmed for a given task.

In other words, it gives the computer the ability to learn.

It does so by automating and improving the learning process based on experience.

The process starts by providing data and training the machine by building machine learning models using the data and different algorithms.

Supervised training

Supervised learning happens when the model is getting trained on a labeled dataset - a set which contains both the input and output parameters (expected result).

Image classification is a supervised training problem – we define a set of objects we want to identify and train a model to recognize them through photos already labeled.

When we train the model, the data should be split in 80:20 ratio.

80% of the data is used to train the model and 20% is used to test it.

The model will learn only from the training data, and once it’s ready, we test it by providing input from the remaining 20% (data which the model has never seen before).

The accuracy can be calculated by comparing the predictions received from the model (output) with the actual results from the dataset.

The calculation method is done by using an input variable ‘x’ and an output variable ‘y’, and an algorithm to map the input to the output. ‘y = f(x)’.

Let’s give an example.

We have 100 images of apples and bananas.

We use 80 images to train the model.

if shape of the object is round and its color is red, it will be labeled as an apple.
if shape of the object is long cylinder and its color is green-yellow, it will be labeled as a banana

Afterwards, we take the remaining 20 images and let the model identify it properly.

Unsupervised training

Unsupervised learning is the training of the machine using data that is neither classified nor labeled.

The algorithm will act on the information without having any guidance by the user.

The calculation method is done by using an input variable ‘x’ and no corresponding output variable.

Let’s give an example.

We have 100 images of fruits: apple, bananas, cherry, and strawberries.

We use 80 images to train the model.

The computer knows nothing about anything, so it needs multiple epochs to discover what’s similar.

Let’s take a trait (color) and group objects by it.

We arrange the images by color, and provide first all the red ones (apples, cherry, strawberry), all the green-yellow ones (bananas), and then all the green ones (apples).

Let’s take another trait (size) and group objects by it.

We arrange the images by size, and provide first all the small ones (Strawberries, cherry), then the medium ones (apples) and then the big ones (bananas).

Let’s take another trait (shape) and group objects by it.

We arrange the images by shape, and provide first all the round ones (Strawberries, cherry, apples), and then cylinder ones (bananas).

Now, the computer should be able to recognize and differentiate between them properly.

We can see that because the fruits are similar in color, size and shape, they could be wrongfully classified - because they are very similar.

Deep learning

Deep learning is a subset of machine learning, in which an algorithm performs a task multiple times, and tweak it a little so it can improve the outcome. This allows a machine to perform better with time, similar to humans. The more the deep learning algorithm learns, the better they will perform.

Deep learning gets us closer and closer to the raw data without the need of the human involvement – it has the ability to form representation of the raw data instead of having a person first extracting the features with which the ML algorithms can work with.

With machine learning, we train an algorithm to differentiate between two objects (cats vs dogs) and then we further use that algorithm to clasify new images of cats and dogs.

With deep learning, we do the same, but each time we provide it a new image of cats or dogs, it also improves itself, therefore making the learning a continuous process.

Color	Size	Shape	Result
Red	Small	Round	Strawberry / cherry
Red	Medium	Round	Apple
Green-yellow	Big	Cylinder	Bananas
Green	Medium	Round	Apple (green apple)

Natural Language Processing (NLP)

Natural Language Processing (NLP) is a subset of Artificial Intelligence that is focused on enabling computers to understand and process human languages.

NLP further consist of two subsets: Natural Language Understanding (NLU) and Natural Language Generation (NLG).

Natural Language Understanding uses AI to understand the input which is in the form of text or speech format and to interpret meaning from a text.

NLU can be applied in many fields, such as:

Chat bot: Customer chat services which receive messages and respond back to the customer.
Speech recognition: Voice assistance
Machine translation: used to translate text from one language to another

The following steps are performed to process the text:

Sentence segmentation

Sentence segmentation means breaking the text apart into separate sentences. This could be as simple as splitting the sentences based on the dot character, or split it into paragraphs.

Tokenization

Tokenization is the process of breaking strings into tokens, such as words.

Stemming

Stemming is the process of normalizing words into its base / root form, by cutting the end or beginning of the word, taken into account the prefixes and suffixes of a word. For example, given the input ‘going’, we will get the word ‘go’.

Lemmatization

Lemmatization is similar to stemming, as it maps several words into one common root, and its output is a proper word, but the difference is that in this step, we map multiple different tokens to the same output.

For example, a lemmatization process will map ‘gone’, ‘going’ and ‘went’ into ‘go’.

POS tags

POS tags is the process of assigning each token its grammatical type/part of speech(word, noun, adjective, article). A word can have more than one part of speech based on its context.

Identification of stop words

There are some words used as filler (and, the, a) which can be flagged as stop words – and filter them out before doing any analysis on the text.

Named entity recognition

Named entity recognition is the process of detecting named entities (person names, company names, location, dates and times, amounts of money, names of events) within the text.

Chunking

Chunking is the process of picking up pieces of information and grouping them into bigger pieces called chunks.

Natural Language Generation (NLG)

Natural Language Generation is the process of transforming data into written narative.

It all started with a simple gap-filling approach within a template, in which the text had a predefined structure and only a small amount of data had to be filled in, from a database entry, file or spreadsheet.

With the rise of Artificial intelligence, NLG is now capable of dynamically creating documents.

If someone wants to describe the status of a cryptocurrency, and it has a table regarding the price the currency had on a monthy basis, NLG can create an article describing how the prices have raised or dropped, what and when was the peak, provide some statistical data or insights and so much more.

The narrative design (template) is very important here, which is constructed by the provider of the software or by its end user. The narrative design contains rules that trigger different outputs based on the data, contains the writing style, or its structure.

With NLG, we can produce as many unique narratives as we want, in much less time than it would take for us humans to think and write them ourselves manually.

NLG models rely on a number of algorithms that address certain problems of creating human-like texts. Two of the most common algorithms being used are Markov's chain and RNNs.

Markov chain

Markov chain is one of the first algorithms for language generation, and the model predicts the next word in the sentence by using the current word and the relationship with other words. This is also the one our smartphones use currently to predict our next words when we write a text.

Recurrent Neural Networks (RNN)

Recurrent Neural Networks (RNN) are models that try to mimic the human brain, by passing each item of the sequence through a feed-forward network and use the output of that model as input for the next item in the sequence.

Computer Vision

Computer vision is a subset of machine learning which refers to teaching computers how to understand images and videos (videos are a sequence / series of images).

Cameras can take pictures by converting lights into pixels, but these numbers do not carry meaning. To take pictures is not the same as to understand what’s in the picture. What we want to teach the computers is to see and understand what’s in a picture just like humans, such as naming objects, identifying people, understanding 3D geometries from a 2D image, emotions, and much more.

Big data is currently used in relation with computer vision, by fetching millions of images, cleaning them, labeling them into categories, and feeding them to a machine for further processing.

Computer vision can be applied in many fields, such as:

Face recognition: Face recognition refers to finding people in images. Given an image, the computer should tell us whether there’s a person or not in that image, and localize them by placing a box around them.
Facial expression recognition: Facial expression recognitions refer to not only finding a human in an image, but also recognize what is his state or emotion – is he happy, is he smiling?
Object recognition: Object recognition refers to finding specific objects in images. Although a dataset is required (ML), we still need to feed the input image to the machine.
Robot Vision: Robot / robotic vision can be used in many fields, for example in autonomous driving (cars being driven without any drivers).
Computer graphics: Computer graphics refers to any processing that can be performed on images, such as adding and removing objects from one image / video to another.

Shortcut to seniority

Badea Robert