
New report outlines SA’s biggest challenges to AI adoption

Take yourself back to February 2020. Life was relatively normal, kids were at school, we physically went into work, and everyone was more certain of the paths they were on. A year later, people of all ages are a lot more tech savvy, having been forced to work from home, attend school online, or hold online gatherings just to keep in touch with loved ones. We have had to embrace the change and step out of our comfort zones, learning how to use technology to navigate everyday life. While it’s true that South Africa is still behind in digitization, it’s catching up fast thanks to COVID-19, with boardrooms across the country focusing on digitization like never before.

One such focus is the efficiency driven by Artificial Intelligence and Machine Learning (AI/ML). SafriCloud surveyed SA’s leading IT decision makers to assess the sentiment and adoption outlook for these technologies amongst business and IT professionals. The results have been published in an eye-opening report entitled, ‘AI: SA – The state of AI in South African businesses 2021’.

‘Keen to start but facing a few challenges’ was the pervasive theme across the survey respondents, but with the global Machine Learning market projected to grow from $7.3 billion in 2020 to $30.6 billion by 2024*, why do we still see resistance to adoption?

Nearly 60% of respondents said that their business supports them in their desire to implement AI/ML and yet only 25% believed that it is understood well at an executive level. While ‘fear of the unknown’ ranked in the top three adoption challenges both locally and internationally (Gartner, 2020), only 9.34% of respondents cited ‘lack of support from C-suite’ as a challenge.

There is a clear degree of pessimism about the level of skills and knowledge to be found in the South African market. This pessimism is more pronounced at senior management level, where more than 60% rated ‘low internal skill levels’ as the top challenge facing AI/ML adoption. With nearly 60% of the respondents rating the need to implement AI/ML in the next two years as ‘important’ to ‘very important’ and only 35% of businesses saying they currently have internal resources focused on AI/ML, the skills gap will continue to grow.

Artificial Intelligence and Machine Learning represent a new frontier in business. Like previous generations that faced new frontiers – such as personal computing and the industrial revolution – we can’t predict what these changes might lead to. All we can really say is that business will be different, jobs will be different and how we think will be different. Those open to being different will be the ones that succeed.

Get free and instant access to the full report, to discover whether your business is leading the way or falling behind: https://www.safricloud.com/ai-sa-the-state-of-ai-in-south-african-businesses/

Report highlights include:

  • The areas of AI/ML that are focused on the most.
  • The state of the AI job market and how to hire.
  • Practical steps to train and pilot AI/ML projects.

How is Coding Used in Data Science & Analytics

What is Data Science?

In recent years the phrase “data science” has become a buzzword in the tech industry. The demand for data scientists has surged since the late 1990s, presenting new job opportunities and research areas for computer scientists. Before we delve into the computer science aspect of data science, it’s useful to know exactly what data science is and to explore the skills required to become a successful data scientist.

Data science is a field of study that involves the processing of large sets of data with statistical methods to extract trends, patterns, or other relevant information. In short, data science encapsulates anything related to obtaining insights, trends, or any other valuable information from data. The foundations of these tasks originate from the fields of statistics, programming, and visualization, combined with knowledge of the domain being studied. A successful data scientist has in-depth knowledge in these four pillars:

  1. Math and Statistics: From modeling to experimental design, encountering something math-related is inevitable, as data almost always requires quantitative analysis.
  2. Programming and Database: Knowing how to navigate program data hierarchies, or big data, and query certain datasets alongside knowing how to code algorithms and develop models is invaluable to a data scientist (more on this below).
  3. Domain Knowledge and Soft Skills: A successful and effective data scientist is knowledgeable about the company or firm at which they are working and proactive at strategizing and/or creating innovative solutions to data issues.
  4. Communication and Visualization: To make their work viable for all audiences, data scientists must be able to weave a coherent and impactful story through visuals and facts to convey the importance of their work. This is usually completed with certain programming languages or data visualization software, such as Tableau or Excel.

Does Data Science Require Coding?

Short answer: yes. As described in points 2 and 4, coding plays a significant role in data science, making appearances in almost every step of the process. But how exactly is coding used at each step of solving a data science problem? Below, you’ll find the different stages of a typical data science experiment and a detailed account of how coding is integrated within the process. It’s important to remember that this process is not always linear; data scientists tend to ping-pong back and forth between different steps depending on the nature of the problem at hand.

Preplanning and Experimental Design

Before coding anything, it’s necessary for data scientists to understand the problem that is being solved and the desired objective. This step also requires data scientists to figure out which tools, software, and data will be used throughout the process. Although coding is not involved in this phase, it can’t be skipped, as it allows data scientists to keep their focus on the objective and not be distracted by noise or by unrelated data and results.

Obtaining Data

The world has a massive amount of data that is growing constantly. In fact, Forbes reports that humans create 2.5 quintillion bytes of data daily. From such vast amounts of data arise vast amounts of data quality issues. These issues range from duplicate or missing datasets and values to inconsistent, misentered, or outdated data. Obtaining relevant and comprehensive datasets is tedious and difficult. Oftentimes, data scientists use multiple datasets, pulling the data they need from each one. This step requires coding with a query language such as SQL, or with the query interfaces offered by NoSQL databases.
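As a minimal sketch of pulling only the needed columns from more than one dataset, the example below runs a SQL join through Python’s built-in sqlite3 module against an in-memory database. The table names (patients, visits), their columns, and the rows are hypothetical and invented purely for illustration.

```python
import sqlite3

# In-memory database holding two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visits (patient_id INTEGER, visit_date TEXT, cost REAL);
    INSERT INTO patients VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO visits VALUES (1, '2021-01-05', 120.0),
                              (1, '2021-02-11', 80.0),
                              (2, '2021-01-20', 200.0);
""")

# Join the two tables and keep only the columns the analysis needs.
rows = conn.execute("""
    SELECT p.name, v.visit_date, v.cost
    FROM patients AS p
    JOIN visits AS v ON v.patient_id = p.id
    ORDER BY v.visit_date
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

In practice the connection would point at a real database rather than an in-memory one, but the shape of the query stays the same.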

Cleaning Data

After all the necessary data is compiled in one location, the data needs to be cleaned. For example, data which is inconsistently labeled “doctor” or “Dr.” can cause problems when it is analyzed. Labeling errors, minor spelling mistakes, and other minutiae can cause major problems down the road. Data scientists can use languages like Python and R to clean data. They can also use applications, such as OpenRefine or Trifacta Wrangler, which are specifically made to clean data and transform it into different formats.
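The “doctor” versus “Dr.” problem can be sketched in a few lines of plain Python. The mapping and the sample values below are assumptions made for illustration; a real project would build the mapping from the labels actually found in its data.

```python
# Hypothetical mapping from inconsistent spellings to one canonical label.
CANONICAL_TITLES = {"doctor": "Dr.", "dr": "Dr.", "dr.": "Dr."}

def clean_title(raw):
    """Return the canonical title, or the trimmed input if it is not recognized."""
    key = raw.strip().lower()
    return CANONICAL_TITLES.get(key, raw.strip())

raw_titles = ["doctor", "Dr.", " DR ", "Doctor", "Prof."]
cleaned = [clean_title(t) for t in raw_titles]
print(cleaned)
```

Unrecognized labels such as “Prof.” are passed through unchanged (apart from trimming), so nothing is silently lost while the mapping is still incomplete.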

Analyzing Data

Once a dataset is clean and uniformly formatted, it is ready to be analyzed. Data analytics is a broad term with definitions that differ from application to application. When it comes to data analysis, Python is ubiquitous in the data science community. R and MATLAB are popular as well, as they were created to be used in data analysis. Though these languages have a steeper learning curve than Python, they are useful for an aspiring data scientist, as they are so widely used. Beyond these languages, there is a plethora of tools available online to help expedite and streamline data analysis.
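What “analysis” means depends on the question being asked, but a first pass often starts with simple summary statistics. The sketch below uses only Python’s standard statistics module; the sales figures are made up for illustration.

```python
import statistics

# Hypothetical, already-cleaned daily sales figures.
daily_sales = [120, 135, 128, 150, 142, 138, 160]

mean_sales = statistics.mean(daily_sales)      # arithmetic average
median_sales = statistics.median(daily_sales)  # middle value when sorted
spread = statistics.stdev(daily_sales)         # sample standard deviation

print(mean_sales, median_sales, round(spread, 2))
```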

Visualizing Data

Visualizing the results of data analysis helps data scientists convey the importance of their work as well as their findings. This can be done using graphs, charts, and other easy-to-read visuals, which allow broader audiences to understand a data scientist’s work. Python is a commonly used language for this step; packages such as seaborn and prettyplotlib can help data scientists make visuals. Other software, such as Tableau and Excel, is also readily available and widely used to create graphics.

Programming Languages used in Data Science

Python is a household name in data science. It can be used to obtain, clean, analyze, and visualize data, and is often considered the programming language that serves as the foundation of data science. In fact, 40% of data scientists who responded to an O’Reilly survey claimed they used Python as their main coding language. The language has contributors that have created libraries solely dedicated to data science operations and extensions into artificial intelligence/machine learning, making it an ideal choice.

Common packages, such as numpy and pandas, can compute complex calculations with matrices of data, making it easier for data scientists to focus on solutions instead of mathematical formulas and algorithms. Even though these packages (along with others, such as sklearn) already take care of the mathematical formulas and calculations, it’s still important to have a solid understanding of said concepts in order to implement the correct procedure through code. Beyond these foundational packages, Python also has many specialized packages that can help with specific tasks.

R and MATLAB are also popular tools used in data science. They are often used for data analysis and can allow for hypothesis testing to validate statistical models. Though these languages have different setups and syntaxes than Python, the basic programming logic is similar across all three, so skills learned in Python carry over, which is one reason Python is often treated as a keystone language in data science.

Other popular programming languages, such as Java, can be useful for the aspiring data scientist to learn as well. Java is used in a vast number of workplaces, and plenty of tools in the big data realm are written in Java or can be used from it. For example, TensorFlow is a software library that is available for Java. The list of coding languages that are relevant or being used directly in the field of data science goes on and on, just as the benefits of learning a new computing language are endless.

Case Study: Python, MATLAB, and R

  • At ForecastWatch, Python was used to write a parser to harvest forecasts from other websites.
  • Financial industries leveraged time-series data in MATLAB to backtest statistical models that are used to engineer fund portfolios.
  • In 2014, Facebook transitioned to using mostly Python for data analysis since it was already used widely throughout the firm.
  • R is widely used in healthcare industries, in areas ranging from drug discovery and pre-clinical trial testing to drug safety data analysis.
  • Sports analysts use R to analyze time-series sports data on certain players in predicting future performances.

Database and Querying

Beyond data analysis, it is imperative to be knowledgeable in querying languages. When obtaining data, data scientists oftentimes navigate multiple databases within different data hierarchies. Languages such as SQL and its successors, as well as firm-specific cloud navigation systems, are key in expediting the data wrangling process. Beyond this, querying languages can also compute basic formulas and operations, such as counts, sums, and averages, directly in the query.
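To illustrate how a query language can compute basic operations on its own, the sketch below asks SQL for a count, sum, and average per group, again using Python’s built-in sqlite3 module. The orders table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (region TEXT, amount REAL);
    INSERT INTO orders VALUES ('north', 10.0), ('north', 30.0), ('south', 5.0);
""")

# The database computes the aggregates; Python only receives the results.
totals = conn.execute("""
    SELECT region, COUNT(*) AS n_orders, SUM(amount) AS total, AVG(amount) AS average
    FROM orders
    GROUP BY region
    ORDER BY region
""").fetchall()
conn.close()

for region, n_orders, total, average in totals:
    print(region, n_orders, total, average)
```

Pushing aggregation into the query means only the summarized rows leave the database, which matters when the underlying tables are large.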

Case Study: Querying in Data Science

  • The U.S. Congress Database is an open source database that can be queried using pSQL, and can answer questions about the demographics of our legislative branch.
  • When companies acquire smaller firms or startups, they often run into the issue of navigating multiple databases. To ease the process, SQL is a popular language used to navigate data.

Data Science is Growing

In almost every step of the data science process, programming is used to achieve different goals. As the field matures and becomes more complex, data scientists will rely more and more heavily on coding to solve increasingly complex problems. For these reasons, it is essential that aspiring data scientists learn to code so that they are prepared for any role. Because of the rapid pace of innovation, the field is constantly expanding, and data scientist positions are constantly opening at companies of all sizes and in all fields. In short, data science and its future are nothing short of exciting!

This article originally appeared on junilearning.com


Fear of the Unknown: Artificial Intelligence

Artificial Intelligence (AI) will be the most popular and developed technological trend in 2020 with a market value projected to reach $70 billion.

AI is impacting several areas of knowledge and business, from the entertainment sector to the medical field where AI is utilizing high-precision algorithms through machine learning that can produce more accurate diagnoses and detect symptoms of serious diseases at a much earlier stage.

The innovation that AI offers to industry, businesses, and consumers is positively changing all processes. The new decade will be driven by the rise of automation and AI-induced robotics.

However, there is a great deal of exaggeration and hysteria about the future of Artificial Intelligence and how humans will need to adapt and get used to living with it. In fact, AI is a topic that has polarized popular opinion. What is true is that AI will become the core of everything that humans interact with in the coming years and beyond. Hence, to have a clear opinion about AI and its impact, it is important to understand what it is and what types of artificial intelligence exist.

Artificial General Intelligence (AGI) is the type of AI that can perform any cognitive function in the way a human does. The technology is not there yet, but it is developing at a fast pace, and there are interesting related projects such as Elon Musk’s Neuralink.

Today, narrow AI applications, each intended to perform only one task, such as IBM Watson, Siri, Alexa, Cortana, and others, are the ones that share the world with us. The key difference between AGI, or wide artificial intelligence, and narrow, or weak, AI lies in goal setting and volition.

In the future, AGI will have the ability to reflect on its own objectives and decide whether to adjust them or not and to what extent. We have to admit that, if done right, this extraordinary technological achievement will change humanity forever.

However, there is still a long way to go to get to that point. Despite this, many fear that Artificial Superintelligence (ASI) will one day go beyond human cognition, an event also known as the technological singularity.

At the moment, there are two emerging and visible groups in society. On the one hand, there is the informed public, a group in which trust towards new and emerging technologies has been increasing over time. On the other hand, there is the mass population, a group in which trust remains stagnant.

Of course, social networks also play a role here. It’s not just about consumption, but about amplification, with people sharing news more than ever and discussing issues relevant to them. Confidence used to be established from the top down, but now it is established horizontally, between peers.

Will AI benefit or destroy society?

AI can only become what humans want it to become. Humans have the task of coding their AI creations. If the mass population is increasingly anxious about AI, this is due to fear of the unknown. Perhaps it is also because there is very little information available about the benefits AI offers to balance with those who believe that AI will destroy society and take away their jobs.

For now, AI has only been providing great benefits and its coverage in the medium term can only benefit and optimize many areas of human activity.  


Python vs. Java: Uses, Performance, Learning

In the world of computer science, there are many programming languages, and no single language is superior to another. In other words, each language is best suited to solve certain problems, and in fact there is often no one best language to choose for a given programming project. For this reason, it is important for students who wish to develop software or to solve interesting problems through code to have strong computer science fundamentals that will apply across any programming language.

Programming languages tend to share certain characteristics in how they function, for example in the way they deal with memory usage or how heavily they use objects. Students will start seeing these patterns as they are exposed to more languages. This article will focus primarily on Python versus Java, which are two of the most widely used programming languages in the world. While it is hard to measure exactly the rate at which each programming language is growing, these are two of the most popular programming languages used in industry today.

One major difference between Python and Java is that Python is dynamically typed, while Java is statically typed. Loosely, this means that Java is much more strict about how variables are defined and used in code. As a result, Java tends to be more verbose in its syntax, which is one of the reasons we recommend learning Python before Java for beginners. For example, here is how you would create a variable named numbers that holds the numbers 0 through 9 in Python:

numbers = []
for i in range(10):
    numbers.append(i)

Here’s how you would do the same thing in Java:

ArrayList<Integer> numbers = new ArrayList<>();
for (int i = 0; i < 10; i++) {
    numbers.add(i);
}
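To make the dynamic typing point concrete, here is a minimal Python sketch (the variable names are arbitrary). It shows that a name can be rebound to a value of a different type, and that type errors only surface when the offending line actually runs. In Java, by contrast, assigning a string to a variable declared as int would be rejected by the compiler before the program ever runs.

```python
# Python checks types at run time, so a name can be rebound to a new type.
value = 10                      # value currently refers to an int
first_type = type(value).__name__
value = "ten"                   # the same name now refers to a str
second_type = type(value).__name__

# A type error only surfaces when the offending line is executed.
try:
    result = "total: " + 5      # Python does not allow str + int
except TypeError as exc:
    result = f"caught at run time: {exc.__class__.__name__}"

print(first_type, second_type, result)
```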

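Another major difference is that Java generally runs programs more quickly than Python. Java source code is compiled ahead of time into bytecode, which the Java Virtual Machine then translates into machine-level code as the program runs. By contrast, Python is usually described as an interpreted language: the standard interpreter runs a program directly from its source, with no separate compile step for the programmer to perform, and typically executes it instruction by instruction rather than as optimized machine code.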
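As a small aside on the compile versus interpret distinction: the reference implementation, CPython, does translate source into bytecode internally before executing it, even though the programmer never runs a separate compile command. The sketch below makes that hidden step visible using the built-in compile function and the standard dis module; the one-line program is just an illustration.

```python
import dis

source = "total = 1 + 2"

# compile() turns source text into a code object that holds bytecode.
code_obj = compile(source, "<example>", "exec")

# Executing the code object is the "interpretation" step.
namespace = {}
exec(code_obj, namespace)
print(namespace["total"])

# dis.dis prints a human-readable listing of the bytecode instructions.
dis.dis(code_obj)
```

The exact instructions printed by dis vary between Python versions, so the listing is best read as an illustration rather than a stable format.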

Usage and Practicality

Historically, Java has been the more popular language, in part due to its lengthy legacy. However, Python is rapidly gaining ground. According to GitHub’s State of the Octoverse report, it has recently surpassed Java in the ranking of the most widely used programming languages. According to a 2018 developer survey, Python is now the fastest-growing programming language.

Both Python and Java have large communities of developers to answer questions on websites like Stack Overflow. As you can see from Stack Overflow trends, Python surpassed Java in terms of the percentage of questions asked about it on Stack Overflow in 2017. At the time of writing, about 13% of the questions on Stack Overflow are tagged with Python, while about 8% are tagged with Java!

Web Development

Python and Java can both be used for backend web development. Typically developers will use the Django and Flask frameworks for Python and Spring for Java. Python is known for its code readability, meaning Python code is clean, readable, and concise. Python also has a large, comprehensive set of modules, packages, and libraries that exist beyond its standard library, developed by the community of Python enthusiasts. Java has a similar ecosystem, although perhaps to a lesser extent.

Mobile App Development

In terms of mobile app development, Java dominates the field, as it is the primary language used for building Android apps and games. Thanks to libraries tailored to the platform, developers have the option to write Android apps by leveraging robust frameworks and development tools built specifically for the operating system. Currently, Python is not commonly used for mobile development, although there are tools like Kivy and BeeWare that allow you to write code once and deploy apps across Windows, OS X, iOS, and Android.

Machine Learning and Big Data

Conversely, in the world of machine learning and data science, Python is the most popular language. Python is often used for big data, scientific computing, and artificial intelligence (A.I.) projects. The vast majority of data scientists and machine learning programmers opt for Python over Java while working on projects that involve sentiment analysis. At the same time, it is important to note that many machine learning programmers may choose to use Java while they work on projects related to network security, cyber attack prevention, and fraud detection.

Where to Start

When it comes to learning the foundations of programming, many studies have concluded that it is easier to learn Python over Java, due to Python’s simple and intuitive syntax, as seen in the earlier example. Java programs often have more boilerplate code – sections of code that have to be included in many places with little or no alteration – than Python. That being said, there are some notable advantages to Java, in particular its speed as a compiled language. Learning both Python and Java will give students exposure to two languages that lay their foundation on similar computer science concepts, yet differ in educational ways.

Overall, it is clear that both Python and Java are powerful programming languages in practice, and it would be advisable for any aspiring software developer to learn both languages proficiently. Programmers should compare Python and Java based on the specific needs of each software development project, as opposed to simply learning the one language that they prefer. In short, neither language is superior to the other, and programmers should aim to have both in their coding experience.

Category | Python | Java
Runtime Performance | | Winner
Ease of Learning | Winner |
Practical Agility | Tie | Tie
Mobile App Development | | Winner
Big Data | Winner |

This article originally appeared on junilearning.com


5 Key Challenges In Today’s Era of Big Data

Digital transformation will create trillions of dollars of value. While estimates vary, the World Economic Forum in 2016 estimated an increase of $100 trillion in global business and social value by 2030. Due to AI, PwC has estimated an increase of $15.7 trillion and McKinsey has estimated an increase of $13 trillion in annual global GDP by 2030. We are currently in the middle of an AI renaissance, driven by big data and breakthroughs in machine learning and deep learning. These breakthroughs offer opportunities and challenges to companies depending on the speed at which they adapt to these changes.

Modern enterprises face 5 key challenges in today’s era of big data

1. Handling a multiplicity of enterprise source systems

The average Fortune 500 enterprise has a few hundred enterprise IT systems, all with their different data formats, mismatched references across data sources, and duplication.

2. Incorporating and contextualising high frequency data

The challenge gets significantly harder as sensor deployments increase, resulting in inflows of real-time data. For example, readings of the gas exhaust temperature for an offshore low-pressure compressor are of only limited value in and of themselves. But combined with ambient temperature, wind speed, compressor pump speed, history of previous maintenance actions, and maintenance logs, this real-time data can create a valuable alarm system for offshore rig operators.
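The idea of contextualising a raw reading can be sketched as follows. This is only an illustration under stated assumptions: the Reading fields, the linear “expected rise” rule, and every threshold are hypothetical values chosen for the example, not figures from any real compressor or alarm system.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    exhaust_temp_c: float    # gas exhaust temperature
    ambient_temp_c: float    # ambient temperature at the rig
    pump_speed_rpm: float    # compressor pump speed
    days_since_service: int  # derived from the maintenance history

def should_alarm(r: Reading) -> bool:
    """Flag a reading only when the exhaust temperature is high in context.

    All numbers below are made-up illustration values.
    """
    # The rise above ambient matters more than the raw temperature.
    rise = r.exhaust_temp_c - r.ambient_temp_c
    # Expect a hotter exhaust at higher pump speeds (hypothetical linear rule).
    expected_rise = 200.0 + 0.05 * r.pump_speed_rpm
    # Tolerate less deviation when maintenance is overdue.
    tolerance = 20.0 if r.days_since_service > 180 else 40.0
    return rise > expected_rise + tolerance

recent_service = Reading(415.0, 30.0, 3000.0, 30)
overdue_service = Reading(415.0, 30.0, 3000.0, 200)
print(should_alarm(recent_service), should_alarm(overdue_service))
```

The same exhaust temperature triggers an alarm for one machine and not the other, which is the point of combining the reading with its context.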

3. Working with data lakes

Today, storing large amounts of disparate data by putting it all in one infrastructure location does not reduce data complexity any more than letting data sit in siloed enterprise systems. 

4. Ensuring data consistency, referential integrity, and continuous downstream use

A fourth big data challenge is representing all existing data as a unified image, keeping this image updated in real-time and updating all downstream analytics that use these data. Data arrival rates vary by system, data formats from source systems change, and data arrive out of order due to networking delays.
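One simple way to keep a unified image correct when data arrive out of order is a “newest timestamp wins” rule, sketched below. The record layout (asset_id, timestamp, status) is an assumption made for this example, and real systems usually need more than this, such as handling ties and schema changes.

```python
def apply_update(image, update):
    """Apply an update to the unified image unless a newer one was already seen."""
    key = update["asset_id"]
    current = image.get(key)
    if current is None or update["timestamp"] > current["timestamp"]:
        image[key] = update
    return image

# Updates arrive out of order: the second one is older than the first.
updates = [
    {"asset_id": "pump-1", "timestamp": 20, "status": "running"},
    {"asset_id": "pump-1", "timestamp": 10, "status": "stopped"},
    {"asset_id": "pump-2", "timestamp": 15, "status": "running"},
]

image = {}
for u in updates:
    apply_update(image, u)

print(image["pump-1"]["status"])
```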

5. Enabling new tools and skills for new needs

Enterprise IT and analytics teams need to provide tools that enable employees with different levels of data science proficiency to work with large data sets and perform predictive analytics using a unified data image.

Let’s look at what’s involved in developing and deploying AI applications at scale

Data assembly and preparation

The first step is to identify the required and relevant data sets and assemble them. There are often issues with data duplication, gaps in data, unavailable data and data out of sequence.
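The issues listed above (duplicates, gaps, and data out of sequence) can be illustrated with a short sketch. It assumes each record is a (timestamp, value) pair with integer timestamps and a fixed expected spacing; both assumptions are for illustration only.

```python
def prepare(records, expected_step=1):
    """Deduplicate, sort by timestamp, and report gaps in (timestamp, value) pairs."""
    # Remove exact duplicates, then restore chronological order.
    unique_sorted = sorted(set(records))
    # A gap is any jump between consecutive timestamps larger than expected_step.
    gaps = [
        (prev[0], curr[0])
        for prev, curr in zip(unique_sorted, unique_sorted[1:])
        if curr[0] - prev[0] > expected_step
    ]
    return unique_sorted, gaps

raw = [(3, 7.1), (1, 6.9), (2, 7.0), (3, 7.1), (6, 7.4)]
clean, gaps = prepare(raw)
print(clean)
print(gaps)
```

Here the duplicate reading at timestamp 3 is dropped, the records are put back in order, and the missing timestamps between 3 and 6 are reported rather than silently filled.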

Feature engineering

This involves going through the data and crafting individual signals that the data scientists and domain experts think will be relevant to the problem being solved. In the case of AI-based predictive maintenance, signals could include the count of specific fault alarms over the trailing 7, 14, and 21 days; the sum of the specific alarms over the same trailing periods; and the maximum value of certain sensor signals over those trailing periods.
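A minimal sketch of such trailing-window signals follows. The function name, the input layout (a list of alarm dates and a date-to-reading dictionary), and the sample data are assumptions for illustration; only the 7, 14, and 21 day windows come from the text.

```python
from datetime import date, timedelta

def trailing_features(alarm_days, sensor_by_day, as_of, windows=(7, 14, 21)):
    """Build trailing-window features as of a given date.

    alarm_days: dates on which a specific fault alarm fired
    sensor_by_day: dict mapping date -> sensor reading for that day
    """
    features = {}
    for days in windows:
        start = as_of - timedelta(days=days)
        # Each window covers the `days` days up to and including as_of.
        features[f"alarm_count_{days}d"] = sum(
            1 for d in alarm_days if start < d <= as_of
        )
        readings = [v for d, v in sensor_by_day.items() if start < d <= as_of]
        features[f"sensor_max_{days}d"] = max(readings) if readings else None
    return features

alarms = [date(2021, 3, 20), date(2021, 3, 12), date(2021, 3, 3), date(2021, 2, 1)]
sensors = {date(2021, 3, 21): 71.5, date(2021, 3, 10): 88.0, date(2021, 3, 2): 93.2}
features = trailing_features(alarms, sensors, as_of=date(2021, 3, 22))
print(features)
```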

Labelling the outcomes

This step involves labeling the outcomes the model tries to predict. For example, in AI-based predictive maintenance applications, source data sets rarely identify actual failure labels, and practitioners have to infer failure points based on a combination of factors such as fault codes and technician work orders.
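One possible inference rule, sketched under loud assumptions: treat a day as a failure when a critical fault code is followed by a corrective work order within a few days. The fault codes, the order types, the three-day window, and the day-number representation are all hypothetical; real labeling rules come from domain experts.

```python
def infer_failure_days(fault_events, work_orders, max_gap_days=3):
    """Label a day as a failure when a critical fault code is followed
    by a corrective work order within max_gap_days.

    fault_events: list of (day_number, fault_code)
    work_orders: list of (day_number, order_type)
    """
    critical_codes = {"F-OVERHEAT", "F-SEIZE"}  # hypothetical codes
    corrective_days = [day for day, kind in work_orders if kind == "corrective"]
    failures = set()
    for day, code in fault_events:
        if code not in critical_codes:
            continue
        if any(0 <= wo_day - day <= max_gap_days for wo_day in corrective_days):
            failures.add(day)
    return sorted(failures)

faults = [(10, "F-OVERHEAT"), (20, "F-LOWOIL"), (30, "F-SEIZE"), (40, "F-OVERHEAT")]
orders = [(12, "corrective"), (21, "corrective"), (30, "inspection"), (45, "corrective")]
failure_days = infer_failure_days(faults, orders)
print(failure_days)
```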

Setting up the training data

For classification tasks, data scientists need to ensure that labels are appropriately balanced with positive and negative examples to provide the classifier algorithm enough balanced data. Data scientists also need to ensure the classifier is not biased with artificial patterns in the data.
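Checking and correcting label balance can be sketched as below. Random downsampling of the larger class is only one of several options (others include oversampling or class weighting) and is chosen here purely because it is short; the dataset is synthetic.

```python
import random
from collections import Counter

def downsample_majority(examples, seed=0):
    """Balance a binary-labelled dataset by randomly downsampling the larger class.

    examples: list of (features, label) pairs with label in {0, 1}
    """
    positives = [e for e in examples if e[1] == 1]
    negatives = [e for e in examples if e[1] == 0]
    n = min(len(positives), len(negatives))
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    balanced = rng.sample(positives, n) + rng.sample(negatives, n)
    rng.shuffle(balanced)
    return balanced

# Synthetic data: 5 positive examples and 95 negative ones.
data = [([i], 1) for i in range(5)] + [([i], 0) for i in range(95)]
before = Counter(label for _, label in data)
balanced = downsample_majority(data)
after = Counter(label for _, label in balanced)
print(before, after)
```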

Choosing and training the algorithm

Numerous algorithm libraries are available to data scientists today, created by companies, universities, research organizations, government agencies and individual contributors.

Deploying the algorithm into production

Machine learning algorithms, once deployed, need to receive new data, generate outputs, and have some actions or decisions be made based on those outputs. This may mean embedding the algorithm within an enterprise application used by humans to make decisions – for example, a predictive maintenance application that identifies and prioritizes equipment requiring maintenance to provide guidance for maintenance crews. This is where the real value is created – by reducing equipment downtime and servicing costs through more accurate failure prediction that enables proactive maintenance before the equipment actually fails. In order for the machine learning algorithms to operate in production, the underlying compute infrastructure needs to be set up and managed. 

Close-loop continuous improvement

Algorithms typically require frequent retraining by data science teams. As market conditions change, business objectives and processes evolve, and new data sources are identified, organizations need to rapidly develop, retrain, and deploy new models.

The problems that have to be addressed to develop and deploy AI applications at scale are therefore nontrivial. Massively parallel elastic computing and storage capacity are prerequisites. In addition to the cloud, there is a multiplicity of data services necessary to develop, provision, and operate applications of this nature. However, the price of missing a transformational strategic shift is steep. The corporate graveyard is littered with once-great companies that failed to change.

This article originally appeared on Makeen Technologies.


The Future of the Maritime Logistics Industry: Unmanned ships from 2020

Drones operate not only in the sky but also on land and at sea, and Rolls-Royce has focused on the latter in its commercial strategy for vessels.

The company, which no longer manufactures cars (it transferred the automobile division to BMW), is a conglomerate that operates in the aeronautical, aerospace, maritime and energy sectors. It has a clear-cut commitment to the seas: launch unmanned ships by mid-2020.

In the meantime, the Rolls-Royce Blue Ocean research team has already launched a virtual reality prototype in its office in Alesund, Norway, which simulates the views from a ship’s command bridge in 360 degrees. The manufacturer hopes that ship captains can maneuver hundreds of unmanned ships from the ground, without any need to approach the sea.

The idea is that the first fleet of unmanned ships will be built during this year. The first would be tugboats or ferries, boats that make simple, short journeys in controlled environments. At first, all risks must be minimized in order to avoid any possibility of unforeseen events.

The next stage would be the launch of cargo ships, with increasing complexity, especially because they sail in international waters. As of today, there is no legislation that covers unmanned commercial shipping. And the approval of international regulation is always slower than that processed by individual countries.

Unmanned ships, according to Rolls-Royce, will reduce operating costs by 20%. Companies would therefore buy such ships to increase their profit margins. The other side of the technology is the possible loss of jobs. It will not be necessary to have a crew or a large contingent of security personnel. However, piracy will surely remain a threat that requires the presence of minimal security personnel, although in the absence of crew members fewer lives will be at stake and the main risk will be cargo theft.

However, Rolls-Royce has pointed out that new jobs will be created. The operations will have to be performed from the ground: the craft is unmanned, not autonomous. Cybersecurity will be a key element in ensuring secure communication links between the ship and land, hence new job profiles will be necessary.

By removing the control bridge along with the other systems that usually support the crew (including electricity, air conditioning, water, and waste treatment systems), the ships will be able to carry more cargo, reducing costs and increasing revenue. In addition to this, according to the initial calculations, these ships will be 5% lighter and consume between 12 and 15% less fuel, ensuring a greener performance. Similarly, electric, fuel-free ships are being researched with a view to their implementation.

Waterfall Office Park Gardens Elevation, Midrand. 1686 | +27 10 634 0880 | info@mlafrica.com

Machine Learning Africa provides insight into emerging machine learning technologies and their inevitable impact in transforming Africa. It provides a platform where innovators, technology vendors, end users and enthusiasts discuss the latest innovations and technologies that transform businesses and the broader society.
