Machine Learning

How is Coding Used in Data Science & Analytics

What is Data Science?

In recent years the phrase “data science” has become a buzzword in the tech industry. The demand for data scientists has surged since the late 1990s, presenting new job opportunities and research areas for computer scientists. Before we delve into the computer science aspect of data science, it’s useful to know exactly what data science is and to explore the skills required to become a successful data scientist.

Data science is a field of study that applies statistical methods to large sets of data to extract trends, patterns, or other relevant information. In short, data science encapsulates anything related to obtaining insights, trends, or other valuable information from data. The foundations of these tasks come from statistics, programming, domain expertise, and visualization, and a successful data scientist has in-depth knowledge in these four pillars:

  1. Math and Statistics: From modeling to experimental design, encountering something math-related is inevitable, as data almost always requires quantitative analysis.
  2. Programming and Databases: Knowing how to navigate data hierarchies, or big data, and query specific datasets, alongside knowing how to code algorithms and develop models, is invaluable to a data scientist (more on this below).
  3. Domain Knowledge and Soft Skills: A successful and effective data scientist is knowledgeable about the company or firm at which they are working and proactive at strategizing and/or creating innovative solutions to data issues.
  4. Communication and Visualization: To make their work viable for all audiences, data scientists must be able to weave a coherent and impactful story through visuals and facts to convey the importance of their work. This is usually completed with certain programming languages or data visualization software, such as Tableau or Excel.

Does Data Science Require Coding?

Short answer: yes. As described in points 2 and 4, coding plays a significant role in data science, making appearances in almost every step of the process. But how exactly is coding utilized at each step of solving a data science problem? Below, you’ll find the different stages of a typical data science experiment and a detailed account of how coding is integrated within the process. It’s important to remember that this process is not always linear; data scientists tend to ping-pong back and forth between steps depending on the nature of the problem at hand.

Preplanning and Experimental Design

Before coding anything, data scientists need to understand the problem being solved and the desired objective. This step also requires them to figure out which tools, software, and data will be used throughout the process. Although coding is not involved in this phase, it can’t be skipped: it keeps a data scientist focused on the objective and prevents white noise, unrelated data, or stray results from becoming distractions.

Obtaining Data

The world has a massive amount of data that is growing constantly. In fact, Forbes reports that humans create 2.5 quintillion bytes of data daily. From such vast amounts of data arise vast amounts of data quality issues, ranging from duplicate or missing values and inconsistent or misentered data to data that is simply outdated. Obtaining relevant and comprehensive datasets is tedious and difficult. Oftentimes, data scientists use multiple datasets, pulling the data they need from each one. This step requires coding with query languages such as SQL, or with NoSQL systems.

Cleaning Data

After all the necessary data is compiled in one location, it needs to be cleaned. For example, data that is inconsistently labeled “doctor” or “Dr.” can cause problems when it is analyzed. Labeling errors, minor spelling mistakes, and other minutiae can cause major problems down the road. Data scientists can use languages like Python and R to clean data. They can also use applications such as OpenRefine or Trifacta Wrangler, which are made specifically to clean data and transform it between formats.
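As a minimal sketch of this cleaning step (the records and the label map below are invented for illustration), the “doctor”/“Dr.” inconsistency can be handled in a few lines of Python:

```python
# Minimal data-cleaning sketch: normalize inconsistent labels and drop duplicates.
# The records and the TITLE_MAP are made up for illustration.

TITLE_MAP = {"dr.": "doctor", "dr": "doctor", "doctor": "doctor"}

def clean_records(records):
    """Normalize the 'title' field and remove exact duplicates."""
    cleaned, seen = [], set()
    for rec in records:
        title = rec["title"].strip().lower()
        rec = {**rec, "title": TITLE_MAP.get(title, title)}
        key = tuple(sorted(rec.items()))
        if key not in seen:  # keep only the first copy of a duplicate row
            seen.add(key)
            cleaned.append(rec)
    return cleaned

raw = [
    {"name": "Ada", "title": "Dr."},
    {"name": "Ada", "title": "doctor"},   # duplicate once normalized
    {"name": "Grace", "title": "Doctor "},
]
print(clean_records(raw))
```

Tools like OpenRefine apply the same idea (clustering and normalizing near-duplicate values) at scale and interactively.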

Analyzing Data

Once a dataset is clean and uniformly formatted, it is ready to be analyzed. Data analytics is a broad term with definitions that differ from application to application. When it comes to data analysis, Python is ubiquitous in the data science community. R and MATLAB are popular as well, as they were created to be used in data analysis. Though these languages have a steeper learning curve than Python, they are useful for an aspiring data scientist, as they are so widely used. Beyond these languages, there are a plethora of tools available online to help expedite and streamline data analysis.

Visualizing Data

Visualizing the results of data analysis helps data scientists convey the importance of their work as well as their findings. This can be done using graphs, charts, and other easy-to-read visuals, which allow broader audiences to understand a data scientist’s work. Python is a commonly used language for this step; packages such as seaborn and prettyplotlib help data scientists make visuals. Other software, such as Tableau and Excel, is also readily available and widely used to create graphics.

Programming Languages used in Data Science

Python is a household name in data science. It can be used to obtain, clean, analyze, and visualize data, and is often considered the programming language that serves as the foundation of data science. In fact, 40% of data scientists who responded to an O’Reilly survey claimed they used Python as their main coding language. Its community has created libraries dedicated solely to data science operations, with extensions into artificial intelligence and machine learning, making it an ideal choice.

Common packages, such as numpy and pandas, can compute complex calculations with matrices of data, making it easier for data scientists to focus on solutions instead of mathematical formulas and algorithms. Even though these packages (along with others, such as sklearn) already take care of the mathematical formulas and calculations, it’s still important to have a solid understanding of said concepts in order to implement the correct procedure through code. Beyond these foundational packages, Python also has many specialized packages that can help with specific tasks.
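As a toy sketch of what these packages handle for you (the numbers are invented), numpy performs whole-matrix arithmetic in one expression, so the analyst can focus on the question rather than on loop-level math:

```python
import numpy as np

# Toy example: numpy handles the matrix math directly.
# The measurements below are invented for illustration.
scores = np.array([[3.0, 4.0],
                   [1.0, 2.0],
                   [5.0, 6.0]])

col_means = scores.mean(axis=0)   # per-column averages: [3. 4.]
centered = scores - col_means     # broadcasting subtracts the means from every row

print(col_means)
print(centered.sum())             # centering removes the mean, so this sums to 0.0
```

pandas builds on the same machinery, adding labeled rows and columns on top of numpy arrays.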

R and MATLAB are also popular tools in data science. They are often used for data analysis and allow for hypothesis testing to validate statistical models. Though their setups and syntaxes differ from Python’s, the basic logic of working with data is similar across all three, so skills learned in one carry over readily to the others.

Other popular programming languages, such as Java, can be useful for the aspiring data scientist to learn as well. Java is used in a vast number of workplaces, and plenty of tools in the big data realm are written in Java. For example, TensorFlow is a software library that is available for Java. The list of coding languages that are relevant or being used directly in the field of data science goes on and on, just as the benefits of learning a new computing language are endless.

Case Study: Python, MATLAB, and R

  • At ForecastWatch, Python was used to write a parser that harvests forecasts from other websites.
  • Financial firms leverage time-series data in MATLAB to backtest the statistical models used to engineer fund portfolios.
  • In 2014, Facebook transitioned to using mostly Python for data analysis, since it was already widely used throughout the firm.
  • R is widely used in the healthcare industry, in areas ranging from drug discovery and pre-clinical trial testing to drug safety data analysis.
  • Sports analysts use R to analyze time-series data on certain players to predict future performances.

Database and Querying

Beyond data analysis, it is imperative to be knowledgeable in query languages. When obtaining data, data scientists oftentimes navigate multiple databases within different data hierarchies. Languages such as SQL and its many variants, as well as firm-specific cloud data platforms, are key to expediting the data wrangling process. Beyond retrieval, query languages can also compute basic formulas and aggregations based on the programmer’s needs.
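A minimal sketch of this querying step, using Python’s built-in sqlite3 module (the table and rows are invented for illustration):

```python
import sqlite3

# Sketch of a typical query step: build a small table, then let SQL
# do the aggregation instead of pulling every row into Python first.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE forecasts (city TEXT, high_temp REAL)")
conn.executemany(
    "INSERT INTO forecasts VALUES (?, ?)",
    [("Austin", 35.0), ("Boston", 22.0), ("Austin", 33.0)],
)

rows = conn.execute(
    "SELECT city, AVG(high_temp) FROM forecasts GROUP BY city ORDER BY city"
).fetchall()
print(rows)   # [('Austin', 34.0), ('Boston', 22.0)]
```

The same `SELECT ... GROUP BY` pattern carries over to production databases such as PostgreSQL or MySQL; only the connection layer changes.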

Case Study: Querying in Data Science

  • The U.S. Congress Database is an open-source database that can be queried using PostgreSQL and can answer questions about the demographics of our legislative branch.
  • When companies acquire smaller firms or startups, they often run into the issue of navigating multiple databases. To ease the process, SQL is a popular language used to navigate data.

Data Science is Growing

In almost every step of the data science process, programming is used to achieve different goals. As the field matures and problems become more complex, data scientists will rely more and more heavily on coding to solve them. For these reasons, it is integral that aspiring data scientists learn to use code well, so that they are prepared for any role. Because of the rapid pace of innovation, the field is constantly expanding, and data scientist positions are constantly opening at companies of all sizes and in all fields. In short, data science and its future are nothing short of exciting!



Python vs. Java: Uses, Performance, Learning

In the world of computer science, there are many programming languages, and no single language is superior to another. In other words, each language is best suited to solve certain problems, and in fact there is often no one best language to choose for a given programming project. For this reason, it is important for students who wish to develop software or to solve interesting problems through code to have strong computer science fundamentals that will apply across any programming language.

Programming languages tend to share certain characteristics in how they function, for example in the way they deal with memory usage or how heavily they use objects. Students will start seeing these patterns as they are exposed to more languages. This article will focus primarily on Python versus Java, which are two of the most widely used programming languages in the world. While it is hard to measure exactly the rate at which each programming language is growing, these are two of the most popular programming languages used in industry today.

One major difference between Python and Java is that Python is dynamically typed, while Java is statically typed. Loosely, this means that Java is much more strict about how variables are defined and used in code. As a result, Java tends to be more verbose in its syntax, which is one of the reasons we recommend learning Python before Java for beginners. For example, here is how you would create a variable named numbers that holds the numbers 0 through 9 in Python:

numbers = []

for i in range(10):
    numbers.append(i)

Here’s how you would do the same thing in Java:

ArrayList<Integer> numbers = new ArrayList<Integer>();

for (int i = 0; i < 10; i++) {
    numbers.add(i);
}

Another major difference is that Java generally runs programs more quickly than Python, as it is a compiled language: before a program is run, the compiler translates the Java code into bytecode that the Java Virtual Machine executes efficiently. By contrast, Python is an interpreted language, meaning its code is executed at runtime without an ahead-of-time compile step.

Usage and Practicality

Historically, Java has been the more popular language, due in part to its lengthy legacy. However, Python is rapidly gaining ground. According to GitHub’s State of the Octoverse report, it has recently surpassed Java as the most widely used programming language on the platform. As per Stack Overflow’s 2018 Developer Survey, Python is now the fastest-growing programming language.

Both Python and Java have large communities of developers to answer questions on websites like Stack Overflow. As Stack Overflow trends show, Python surpassed Java in terms of the percentage of questions asked about it in 2017. At the time of writing, about 13% of questions on Stack Overflow are tagged with Python, while about 8% are tagged with Java.

Web Development

Python and Java can both be used for backend web development. Typically developers will use the Django and Flask frameworks for Python and Spring for Java. Python is known for its code readability, meaning Python code is clean, readable, and concise. Python also has a large, comprehensive set of modules, packages, and libraries that exist beyond its standard library, developed by the community of Python enthusiasts. Java has a similar ecosystem, although perhaps to a lesser extent.

Mobile App Development

In terms of mobile app development, Java dominates the field, as it is the primary language used for building Android apps and games. Thanks to the aforementioned tailored libraries, developers can write Android apps using robust frameworks and development tools built specifically for the operating system. Currently, Python is not commonly used for mobile development, although there are tools like Kivy and BeeWare that let you write code once and deploy apps across Windows, OS X, iOS, and Android.

Machine Learning and Big Data

Conversely, in the world of machine learning and data science, Python is the most popular language. Python is often used for big data, scientific computing, and artificial intelligence (A.I.) projects. The vast majority of data scientists and machine learning programmers opt for Python over Java while working on projects that involve sentiment analysis. At the same time, it is important to note that many machine learning programmers may choose to use Java while they work on projects related to network security, cyber attack prevention, and fraud detection.

Where to Start

When it comes to learning the foundations of programming, many studies have concluded that Python is easier to learn than Java, due to its simple and intuitive syntax, as seen in the earlier example. Java programs often have more boilerplate code – sections of code that have to be included in many places with little or no alteration – than Python. That being said, there are some notable advantages to Java, in particular its speed as a compiled language. Learning both Python and Java will give students exposure to two languages that rest on similar computer science foundations, yet differ in instructive ways.

Overall, it is clear that both Python and Java are powerful programming languages in practice, and it would be advisable for any aspiring software developer to learn both proficiently. Programmers should compare Python and Java based on the specific needs of each software development project, as opposed to simply using the language they prefer. In short, neither language is superior to the other, and programmers should aim to have both in their repertoire.

| Category | Python | Java |
| --- | --- | --- |
| Runtime Performance | | Winner |
| Ease of Learning | Winner | |
| Practical Agility | Tie | Tie |
| Mobile App Development | | Winner |
| Big Data | Winner | |



The Future of the Stock Market: Machine Learning-based Predictions

Since the arrival of automated investment and artificial intelligence in the stock markets, the Holy Grail of stock market investing has been to develop and refine an algorithm that can predict the future behavior of the market and of the companies listed on it.

Needless to say, knowing how to predict future stock trends translates into cash and sound money, and it is also necessary to act on those predictions ahead of other investors, before the scenario is priced in by the whole market. And now there is a new generation of Machine Learning (ML) systems whose success rate cannot be the result of mere chance: ML already predicts correctly at a very high rate, well above the vast majority of human stock advisors, reaching 79% and even 90% in certain cases.

In the stock market, being first has always meant earning more money or losing less; being the first to trade literally translates into money. It takes for granted that “information is power” and means operating with foresight where others are disoriented and flailing blindly in the markets. It goes without saying that it is usually the latter who end up taking the losses, because there is nothing worse in the markets than having no strategy beyond a few misleading hunches. Here, automated investment may be contributing a great deal, since it establishes clear, concise investment rules and avoids having them broken by human passions that are very dangerous for your pockets, such as euphoria or panic.

But even with an AI that will inevitably be marketed, and the more massively the better, it is highly likely that those predictions will be available to many investors, human or synthetic. Under this scenario, when many in the market hold a prediction with a high probability of being fulfilled, being the first to trade will again translate into money, with the addition that speed will now be absolutely decisive in determining profit or loss on each operation.

Often only technical analysis is used as the tool for stock predictions, because it is considered easy for the algorithm to learn and for the human to interpret: predictions rest on a single attribute, the stock’s historical prices. A typical algorithm of this kind takes a single stock as input and produces future predictions for that stock alone.

Here are a few companies that use Machine Learning in Technical analysis for stock prediction:

● Trading Technologies.
● GreenKey Technologies.
● Kavout.
● Auquan.
● Epoque.
● Sigmoidal.
● Equbot.
● AITrading.

Recently, an Israel-based stock forecast company named ‘I Know First’, using predictive Artificial Intelligence, demonstrated an accuracy of up to 97% in its predictions for the S&P 500 and Nasdaq indices, as well as their respective ETFs. So there is a lot that can be achieved or explored with the use of Machine Learning in stock prediction. AI is just a new twist on what has already been the virtualization of markets since the arrival of automated investment.

As we said before, this profitable symbiosis of telecommunications and trading operations is not exactly new, as it has been that way since the dawn of automated investment some five years ago. It began by extracting benefits from small (even imperceptible) market fluctuations, in which the agility of the operation was fundamental, since these spikes in share prices can last just a fraction of a second; if one was able to operate on the same temporal order of magnitude, there was a benefit to be taken from the market. What is really new now is that, as we will analyze, this ultra-rapid factor acquires double relevance under the scenario of a successful AI algorithm.

We must emphasize that these algorithms may be helping to improve price formation and make the market work better, but the downside is delegating human decision-making to algorithms whose reaction to black-swan events is unknown. Indeed, we said before that human error means being carried away by euphoria or panic, but we said this assuming regular conditions. In volatility scenarios not for the faint of heart, and in black-swan events, although many investors can still fall prey to those unprofitable passions, this is the moment when a mature, professional, and experienced manager is literally worth his weight in gold, and when he should take the helm.

The software architecture of investment programs should require something like an airplane’s autopilot: in regular conditions the aircraft is perfectly piloted by the automated system, but when things look rough, the pilot can regain control of the ship and get the passengers out of vital trouble. Automated investment must take the same precautions, because today the human mind is still far more intuitive and analytical than an algorithm that, after all, is based on historical data, which may sometimes not serve as a seed for iterating the learning of an artificial intelligence. This is especially so when we must weigh factors of subjective perception, which can also strongly influence the market and whose complexity adds a great degree of difficulty for an objective robot, not to mention the global cost of continually retraining a multitude of investment robots across the planet.

Bid farewell to the simple real-time investment that was new in the 90s, and welcome to the new era of real-time market investments. We will operate on an ephemeral, ever-changing market scenario, one that will cease to exist as soon as we act on it together with a certain critical mass of investors. Never before have your investments had more aggregate capacity to cause disruptions in the market. Welcome to the new reality.

This article is co-authored by Dr. Raul Villamarin Rodriguez and Rajat Toshniwal, Woxsen School of Business


The Future of HR from 2020: Machine Learning & Deep Learning

The future of HR lies in Deep Learning, which is machine learning on steroids. It uses a technique that gives machines an improved ability to find, and amplify, even the smallest patterns. This technique is called a deep neural network: deep because it has many layers of simple computational nodes that work together to process data and deliver a final result in the form of a prediction.

Neural networks were vaguely inspired by the inner workings of the human brain: the nodes are like neurons, and the network is like the brain itself. But Geoffrey Hinton published his breakthrough work at a time when neural networks had gone out of style. No one really knew how to train them, so they were not producing good results, and the technique took almost 30 years to recover. Then, suddenly, it emerged from the abyss.

One last thing we should know in this introduction: machine learning (and deep learning) comes in three flavors: supervised, unsupervised, and reinforcement.

In supervised learning, the most common flavor, the data is labeled to tell the machine exactly what patterns to look for. Think of it as a tracking dog that will chase targets once it knows the scent it is looking for. That is what you are doing when you press play on a Netflix program: you are telling the algorithm to find similar programs.

In unsupervised learning, the data has no labels; the machine simply searches for any pattern it can find. This is like letting a person examine tons of different objects and sort them into groups with similar traits. Unsupervised techniques are less popular because they have less obvious applications, but curiously, they have gained strength in cybersecurity.

Finally, we have reinforcement learning, the latest frontier of machine learning. A reinforcement algorithm learns by trial and error to achieve a clear objective: it tries many different things and is rewarded or penalized depending on whether its behaviors help or hinder it in reaching its goal. This is like a child being rewarded with praise and affection for good behavior. Reinforcement learning is the basis of Google’s AlphaGo, the program that surpassed the best human players in the complex game of Go.
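To make the supervised case concrete, here is a minimal sketch (with invented data points and labels) of learning from labeled examples, using a one-nearest-neighbor rule:

```python
# Minimal supervised-learning sketch: a 1-nearest-neighbor classifier.
# The labeled training points and the queries are invented for illustration.

def nearest_neighbor(labeled_points, query):
    """Return the label of the training point closest to the query."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    point, label = min(labeled_points, key=lambda pl: sq_dist(pl[0], query))
    return label

training = [((0.0, 0.0), "negative"),
            ((0.1, 0.2), "negative"),
            ((1.0, 1.0), "positive")]

print(nearest_neighbor(training, (0.9, 0.8)))   # closest to (1.0, 1.0) -> positive
```

The labels here play the role of the “scent” in the tracking-dog analogy: the algorithm is told what each training example is, and it classifies new data by similarity to those examples.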

Applied to Human Resources, although the growth potential is wide, the current use of Machine Learning is limited, and it presents a dilemma that must be resolved in the future concerning the ability of machines to discover talent in human beings beyond hard, verifiable credentials such as level of education.

Software intelligence is transforming human resources. At the moment, its main focus is on recruitment, which in most cases is a very expensive and inefficient process in which the goal is to find the best candidates among thousands, although there are multiple other examples of its application.

A first example would be technology that helps people write gender-neutral job descriptions to attract the best possible candidates, whether male or female. This would broaden the pool of job seekers and lead to a more balanced population of employees.

A second example is training recommendations for employees. On many occasions, employees have many training options but cannot find what is most relevant to them; these algorithms therefore surface the internal and external courses that best suit an employee’s development objectives, based on many variables, including the skills the employee intends to develop and the courses taken by other employees with similar professional objectives.

A third example is Sentiment Analysis, a form of NLP (Natural Language Processing) that analyzes the social conversations generated on the Internet to identify opinions and extract the emotions (positive, negative, or neutral) they implicitly carry. Sentiment analysis determines:

-Who is the subject of the opinion.

-What is being said about it.

-Whether the opinion is positive, negative, or neutral.

This tool can be applied to words and expressions, as well as phrases, paragraphs, and documents found in social networks, blogs, forums, or review pages. Sentiment analysis determines the hidden connotation behind information that is subjective.

There are different systems of sentiment analysis:

-Sentiment analysis by polarity: Opinions are classified as very positive, positive, neutral, negative, or very negative. This type of analysis is very simple with reviews that use scoring mechanisms from 1 to 5, where 1 is very negative and 5 is very positive.

-Sentiment analysis by type of emotion: The analysis detects specific emotions and feelings: happiness, sadness, anger, frustration, etc. For this, there is usually a list of words and the feelings with which they are usually associated.

-Sentiment analysis by intention: This system interprets comments according to the intention behind them: Is it a complaint? A question? A request?
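A toy sketch of the polarity approach described above (the score cut-offs and the word lists are invented; production systems use trained models and large lexicons):

```python
# Toy polarity analysis, mirroring the 1-5 scoring scheme described above.

def polarity(score):
    """Map a 1-5 review score to a polarity class."""
    labels = {1: "very negative", 2: "negative", 3: "neutral",
              4: "positive", 5: "very positive"}
    return labels[score]

# A crude lexicon-based scorer for free text (the word lists are invented).
POSITIVE = {"great", "happy", "excellent"}
NEGATIVE = {"bad", "angry", "terrible"}

def text_sentiment(text):
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(polarity(5))                                       # very positive
print(text_sentiment("great excellent but one bad day")) # positive
```

Real sentiment systems replace the hand-written word lists with statistical or neural models, but the pipeline (tokenize, score, classify) is the same.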

A fourth example is employee attrition analysis, through which we can predict which employees will remain at the company and which will not, based on several parameters, as shown in the following example:

[Figure: sample employee attrition dataset]
Source: IBM (IBM Watson sample dataset)
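A heavily simplified sketch of the attrition idea (the features, weights, and threshold below are all invented for illustration; real systems such as the IBM sample learn these from data):

```python
# Toy attrition predictor: score employees on a few invented features and
# flag those above a threshold as likely to leave. The weights are
# illustrative, not learned from data.

def attrition_risk(employee):
    score = 0.0
    score += 0.4 if employee["overtime"] else 0.0
    score += 0.3 if employee["years_at_company"] < 2 else 0.0
    score += 0.3 if employee["job_satisfaction"] <= 2 else 0.0  # 1-4 scale
    return score

def likely_to_leave(employee, threshold=0.5):
    return attrition_risk(employee) >= threshold

new_hire = {"overtime": True, "years_at_company": 1, "job_satisfaction": 2}
veteran  = {"overtime": False, "years_at_company": 10, "job_satisfaction": 4}
print(likely_to_leave(new_hire), likely_to_leave(veteran))   # True False
```

A trained model would replace the hand-picked weights with coefficients fitted to historical attrition records, but the output, a per-employee risk flag HR can act on, is the same.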

These four cases are clear examples of Machine Learning elevating the role of human resources from tactical processes to strategic ones. Smart software is enabling the mechanics of workforce management, such as creating job descriptions, recommending courses, or predicting which employees are most likely to leave the company, giving HR the chance to react in time and apply corrective policies for those deficiencies.

From the business point of view, machine learning technology is an opportunity to drive greater efficiency and better decision making. It will help everyone make better decisions and, equally important, will give Human Resources a strategic and valuable voice at the executive level.

Prof Raul Villamarin Rodriguez


Supply Chain 4.0: AI and Robotization

Automated processes, machine learning, and robotization force a constant updating of knowledge for the professionals responsible for logistics in MNCs and SMEs.

On the one hand, ERP systems have become the nerve center of companies, and within them the logistics function is the true heart and engine of all activity. Specialized logistics services companies proliferate and create models that drive the dynamization of international trade, both B2B and B2C.

These systems transcend the mere management we have known so far, processing thousands of data points generated by every department of the company or collected by automated systems. This data is systematically analyzed and managed by algorithms that make automatic decisions, learning from successes and errors.

But if there is one thing causing the entire logistics process to change radically, it is robotization. To date, a large number of personnel were needed for logistics processes such as product and merchandise handling, order issuance, warehouse and inventory control, and replenishment management. Robots are now taking over these functions and, to a large extent, ending the need for manual labour. They are capable of carrying loads of up to 500 kg from one end of the warehouse to the other, and even of moving them from one warehouse to another. Likewise, they can rotate 360 degrees on their axes, rise to load merchandise, or deposit it gently at any point, and they do all of this much faster than humans, without getting sick or requiring rest.

Within five minutes, the robots recharge and gain a range of 4 or 5 hours, so they can cover a full 8-hour shift with a total of ten minutes of recharging and, consequently, can perfectly cover three daily shifts. They have no conflicts among themselves or collective claims, and they are immediately replaceable in case of breakdown or the need for maintenance.

The logistics robotization process is generating profound changes in the business model that affect all areas, especially human resource management.

According to reports from the World Economic Forum, by 2025 the replacement of human personnel with robots in all basic professional areas will have reached 52%. This means the loss of countless unskilled jobs, which will be compensated by the creation of 58 million qualified jobs, necessary for the robotic revolution, over the next 10 years.

It is not difficult to think that technical qualifications will be one of the challenges to overcome in this whole process.

Featured examples of robotization in large multinationals

Two of the most prominent pioneers of this logistics reorganization have been two giants of online commerce: Alibaba and Amazon.

Amazon’s experience

Amazon has more than one hundred thousand robots dedicated to managing customer orders, and all of its warehouses are currently automated. Far from destroying employment, the company has doubled its workforce since 2016 and currently has more than 500,000 workers.

Kiva robots, used by the firm, can easily replace physical work and repetitive, easily programmable activities. But the same does not apply to the other sets of skills demanded in new positions that add value.

Alibaba’s model

With its logistics model, Alibaba has facilitated the entry of thousands of companies into different international markets. Process automation and artificial intelligence are the engines of productivity today, and productivity, in turn, is the key factor in competitiveness.

Only through these logistics processes is it possible to manage the huge volume of orders on days like Black Friday or the Alibaba shopping festival.

JD’s case

Another Chinese e-commerce giant, JD, has recently surpassed Alibaba with a warehouse capable of processing more than 200,000 orders daily under the supervision of only 4 people. The company’s objective is to provide same-day service to all of China, as long as the order is placed before 11 am.

The company has invested millions not only in warehouse robots but also in automatic systems for trucks, other means of transport, and distribution drones.

Prof. Raul V. Rodriguez


The Future of the Maritime Logistics Industry: Unmanned ships from 2020

Drones now operate not only in the sky but also on land and at sea, and Rolls-Royce has focused its commercial strategy on the latter as far as vessels are concerned.

The company, which no longer manufactures cars (it transferred its automobile division to BMW), is a conglomerate operating in the aeronautical, aerospace, maritime, and energy sectors. It has a clear-cut commitment to the seas: launching unmanned ships by mid-2020.

In the meantime, the Rolls-Royce Blue Ocean research team has already launched a virtual reality prototype at its office in Alesund, Norway, which simulates the view from a ship’s command bridge in 360 degrees. The manufacturer hopes that ship captains will be able to maneuver hundreds of unmanned ships from shore, without ever needing to go to sea.

The idea is that the first fleet of unmanned ships will be built this year. The first would be tugboats or ferries: boats that make simple, short journeys in controlled environments. At first, all risks must be minimized in order to avoid any unforeseen events.

The next stage would be the launch of cargo ships, which involve increasing complexity, especially because they sail in international waters. As of today, no legislation covers unmanned commercial shipping, and international regulation is always slower to approve than that processed by individual countries.

Unmanned ships, according to Rolls-Royce, will reduce operating costs by 20%; companies, therefore, would buy them to increase their profit margins. The other side of the technology is the possible loss of jobs: neither a crew nor a large contingent of security personnel will be necessary. Piracy will surely remain a threat requiring a minimal security presence, although with no crew aboard there will be far fewer lives at stake; the main risk will be cargo theft.

Still, Rolls-Royce points out that new jobs will be created. Operations will have to be performed from the ground: these are unmanned craft, not autonomous ones. Cybersecurity will be a key element in securing the communication links between ship and shore, so new professional profiles will be necessary.

By removing the control bridge along with the other systems that normally accommodate the crew (including electricity, air conditioning, water, and waste treatment), the ships will carry more cargo, reducing costs and increasing revenue. According to initial calculations, these ships will also be 5% lighter and consume 12 to 15% less fuel, ensuring greener performance. Electric, fuel-free ships are likewise being researched with a view to their eventual implementation.


Machine Learning Africa provides insight into emerging machine learning technologies and their inevitable impact in transforming Africa. It provides a platform where innovators, technology vendors, end users, and enthusiasts discuss the latest innovations and technologies transforming businesses and broader society.

Contact Us

Waterfall Office Park Gardens Elevation, Midrand. 1686
Tel: +27 10 634 0880



Will Artificial Intelligence reach the level of the human intellect by 2040?

Technological singularity is the hypothesis that there will come a time when artificial intelligence is able to improve itself recursively: in theory, machines capable of creating other, even more intelligent machines, resulting in an intelligence far superior to human beings’ and, more shocking still, beyond our control.

AI, Machine Learning, Neural Networks… these are terms that evoke hope and fear of the unknown in equal measure.

In the next 20 years, there will be more technological change than in the last two millennia. Technology is much faster than the brain (a calculator multiplies 5-digit numbers in tenths of a second), but it works differently; for example, it does not have a level of connectivity equivalent to that of the neurons in a human brain.

However, if the exponential pace of Moore’s law does not stop and the neural network research of giant corporations such as Google continues to advance, by 2040 the degree of technological integration in our lives will far exceed the capacity of the human brain.

The word singularity was taken from astrophysics: a point in space-time (for example, inside a black hole) at which the rules of ordinary physics break down. It was associated with the explosion of artificial intelligence during the 1980s by science-fiction novelist Vernor Vinge. At a NASA symposium in 1993, Vinge predicted that within 30 years there would be the technological means to create a superhuman intelligence, a “Singleton”: a “world order in which there is a single decision-making entity at the highest level, capable of exerting effective control over its domain and preventing internal or external threats to its supremacy”. He assured his audience that, shortly afterwards, we would reach the end of the human era.

Throughout history, some technological advances have caused fear. Fear of the new and the unknown is understandable; however, any technology can be used for good or for evil, just as fire can be used to heat and cook food, or to burn people.

In the case of the singularity, it seems clear that one must be cautious, regulating its development without limiting it and, above all, trying to ensure that this future artificial intelligence learns from ethical and moral values, as well as from the mistakes and successes of our species. We must be clear in our conception of the term: human beings and machines are meant to coexist in symbiosis, not rivalry.

Mortality as an “option” by 2045?

On the other hand, we can ask whether mortality will be “optional” by 2045. Google has already started ambitious research initiatives after concluding that curing aging may be possible, creating companies such as Calico; other ventures, such as Human Longevity and the non-profit Methuselah Foundation, are investigating it as well. The possibilities are real, since immortality already exists in nature: some cells are immortal, and stem cells have the capacity to reproduce indefinitely, just like cancer cells.

One of the steps towards this goal is to fully understand the structure of today’s incurable diseases, and then eradicate them, as has happened with HIV, now a controllable chronic disease, or with diabetes. We must propose the same for aging: first turn it into a controllable chronic disease, and later cure it for good. It is essential to begin human trials of the rejuvenation technologies that have already proven useful in other animals, so that progress can be made in human clinical settings as well.

Prof. Raul V. Rodriguez is an Asst. Professor at Universal Business School.


Predicting people’s driving personalities

Self-driving cars are coming. But for all their fancy sensors and intricate data-crunching abilities, even the most cutting-edge cars lack something that (almost) every 16-year-old with a learner’s permit has: social awareness.

While autonomous technologies have improved substantially, they still ultimately view the drivers around them as obstacles made up of ones and zeros, rather than human beings with specific intentions, motivations, and personalities.

But recently a team led by researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has been exploring whether self-driving cars can be programmed to classify the social personalities of other drivers, so that they can better predict what different cars will do — and, therefore, be able to drive more safely among them.

In a new paper, the scientists integrated tools from social psychology to classify driving behavior with respect to how selfish or selfless a particular driver is.

Specifically, they used something called social value orientation (SVO), which represents the degree to which someone is selfish (“egoistic”) versus altruistic or cooperative (“prosocial”). The system then estimates drivers’ SVOs to create real-time driving trajectories for self-driving cars.
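Social value orientation is commonly expressed in the literature as an angle that trades off reward to oneself against reward to others. A minimal sketch of that angular weighting, assuming the standard “ring” form (the function name and sample values here are illustrative, not taken from the paper):

```python
import math

def svo_utility(own_reward: float, other_reward: float, svo_angle: float) -> float:
    """Weight a driver's own reward against others' using a social value
    orientation (SVO) angle: 0 rad is purely egoistic, pi/4 rad weighs
    self and others equally (fully prosocial/cooperative)."""
    return math.cos(svo_angle) * own_reward + math.sin(svo_angle) * other_reward

# An egoistic driver (angle near 0) values only their own progress...
egoistic = svo_utility(own_reward=1.0, other_reward=0.0, svo_angle=0.0)

# ...while a prosocial driver (angle near pi/4) weighs both rewards equally.
prosocial = svo_utility(own_reward=1.0, other_reward=1.0, svo_angle=math.pi / 4)
```

Under this weighting, an agent’s single angle parameter summarizes where it sits on the egoistic-to-prosocial continuum.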

Testing their algorithm on the tasks of merging lanes and making unprotected left turns, the team showed that they could better predict the behavior of other cars by a factor of 25 percent. For example, in the left-turn simulations their car knew to wait when the approaching car had a more egoistic driver, and to then make the turn when the other car was more prosocial.

While not yet robust enough to be implemented on real roads, the system could have some intriguing use cases, and not just for the cars that drive themselves. Say you’re a human driving along and a car suddenly enters your blind spot — the system could give you a warning in the rear-view mirror that the car has an aggressive driver, allowing you to adjust accordingly. It could also allow self-driving cars to actually learn to exhibit more human-like behavior that will be easier for human drivers to understand.

“Working with and around humans means figuring out their intentions to better understand their behavior,” says graduate student Wilko Schwarting, who was lead author on the new paper that will be published this week in the latest issue of the Proceedings of the National Academy of Sciences. “People’s tendencies to be collaborative or competitive often spill over into how they behave as drivers. In this paper, we sought to understand if this was something we could actually quantify.”

Schwarting’s co-authors include MIT professors Sertac Karaman and Daniela Rus, as well as research scientist Alyssa Pierson and former CSAIL postdoc Javier Alonso-Mora.

A central issue with today’s self-driving cars is that they’re programmed to assume that all humans act the same way. This means that, among other things, they’re quite conservative in their decision-making at four-way stops and other intersections.

While this caution reduces the chance of fatal accidents, it also creates bottlenecks that can be frustrating for other drivers, not to mention hard for them to understand. (This may be why the majority of traffic incidents have involved getting rear-ended by impatient drivers.)

“Creating more human-like behavior in autonomous vehicles (AVs) is fundamental for the safety of passengers and surrounding vehicles, since behaving in a predictable manner enables humans to understand and appropriately respond to the AV’s actions,” says Schwarting.

To try to expand the car’s social awareness, the CSAIL team combined methods from social psychology with game theory, a theoretical framework for conceiving social situations among competing players.

The team modeled road scenarios where each driver tried to maximize their own utility and analyzed their “best responses” given the decisions of all other agents. Based on that small snippet of motion from other cars, the team’s algorithm could then predict the surrounding cars’ behavior as cooperative, altruistic, or egoistic — grouping the first two as “prosocial.” People’s scores for these qualities rest on a continuum with respect to how much a person demonstrates care for themselves versus care for others.
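To give a flavor of how an orientation could be inferred from a snippet of observed behavior, here is a toy sketch: it assumes SVO is an angle trading off own reward against others’, and picks the angle under which a driver’s observed choices look most like utility-maximizing best responses. All payoff numbers are hypothetical, and this is not the paper’s actual algorithm, which reasons game-theoretically over continuous driving trajectories:

```python
import math

def estimate_svo(observed_choices, candidate_angles):
    """Pick the SVO angle that best explains a driver's observed actions.
    Each observation pairs the payoffs (own_reward, other_reward) of the
    action the driver actually took with those of the alternative passed up."""
    def utility(payoff, angle):
        own, other = payoff
        return math.cos(angle) * own + math.sin(angle) * other

    def score(angle):
        # Count how often the observed action beats its alternative
        # under this hypothesized orientation.
        return sum(
            utility(chosen, angle) >= utility(alternative, angle)
            for chosen, alternative in observed_choices
        )

    return max(candidate_angles, key=score)

# A driver who twice took the self-serving option (e.g. blocking a merge)
# is best explained by an angle near 0, i.e. egoistic.
choices = [((1.0, 0.0), (0.5, 1.0)), ((0.6, 0.0), (0.5, 0.5))]
angle = estimate_svo(choices, [0.0, math.pi / 8, math.pi / 4])
```

The design point this illustrates is that only a short history of choices is needed: each observed decision constrains which orientations are consistent with it.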

In the merging and left-turn scenarios, the two outcome options were to either let somebody merge into your lane (“prosocial”) or not (“egoistic”). The team’s results showed that, not surprisingly, merging cars are deemed more competitive than non-merging cars.

The system was trained to try to better understand when it’s appropriate to exhibit different behaviors. For example, even the most deferential of human drivers knows that certain types of actions — like making a lane change in heavy traffic — require a moment of being more assertive and decisive.

For the next phase of the research, the team plans to work to apply their model to pedestrians, bicycles, and other agents in driving environments. In addition, they will be investigating other robotic systems acting among humans, such as household robots, and integrating SVO into their prediction and decision-making algorithms. Pierson says that the ability to estimate SVO distributions directly from observed motion, instead of in laboratory conditions, will be important for fields far beyond autonomous driving.

“By modeling driving personalities and incorporating the models mathematically using the SVO in the decision-making module of a robot car, this work opens the door to safer and more seamless road-sharing between human-driven and robot-driven cars,” says Rus.

The research was supported by the Toyota Research Institute for the MIT team. The Netherlands Organization for Scientific Research provided support for the specific participation of Alonso-Mora.


ML Africa successfully hosted the inaugural AI & The Future of Healthcare Summit

Artificial intelligence and machine learning are the most trending and dominant technologies of our times. They are shaping the future and impacting our daily lives. For businesses and government, agile adoption of these technologies is imperative.

Machine Learning Africa celebrates its successful hosting of the inaugural AI & The Future of Healthcare Summit at Hilton Sandton on the 30th of October 2019. It was a wonderful event where technology enthusiasts shared insights into the development of AI-driven healthcare solutions that improve patient outcomes.

The discussions focused on the future of healthcare, patient engagement, public and private sector collaboration, digital health strategy, AI in radiology, precision medicine, the future of robots in healthcare, diagnostic technologies, and upskilling the healthcare workforce.

Keynote speakers included: Prof. Nelishia Pillay, Head of the Department of Computer Science at the University of Pretoria; Johan Steyn, AI Enthusiast, Portfolio Lead: DevOps & Software at IQBusiness South Africa; Joel Ugborogho, Founder of CenHealth; Dr. Jonathan Louw, MB.ChB, MBA, CEO of the South African National Blood Service (SANBS); Basia Nasiorowska, CEO at NEOVRAR; Josh Lasker, Co-Founder, Abby Health Stations; Dr. Jaishree Naidoo, Paediatric Radiologist and CEO of Envisionit Deep AI; Prof. Antonie van Rensburg, PrEng, Chief Digital Officer, IoTDot4; Dr. Darlington Mapiye (PhD), Technical Lead for the data-driven healthcare team at IBM Research Africa; Dr. Boitumelo Semete, Executive Cluster Manager, CSIR; and Yusuf Mahomedy, Chief Executive of the Association Executive Network of Southern Africa (AENSA).

The event was made possible through partnership with Envisionit Deep AI, a medical technology company that uses artificial intelligence to streamline and improve medical imaging diagnosis for radiologists. Their AI model RADIFY will augment and improve radiology reading and thereby relieve the bottlenecks faced in medical imaging. Other event partners present were Evolutio, SANBS, IQBusiness, IoTDot4 and ICITP.

If you would like to further increase your proficiency in emerging technologies and deploy the most effective strategies within your organization, the Digital Health workshop would be another exciting and relevant event to consider. Entitled ‘Accelerating Digital Health Services’, the workshop is presented in partnership with CenHealth on the 5th of December 2019. In preparation for the coming changes in the healthcare industry, it is imperative that no healthcare institution be left behind on its digital transformation journey.