AI & Machine Learning News. 22 June 2020

The Artificial Intelligence and Machine Learning Newsletter by CloudQuant

The Artificial Intelligence and Machine Learning News clippings for Quants are provided algorithmically with CloudQuant’s NLP engine which seeks out articles relevant to our community and ranks them by our proprietary interest score. After all, shouldn’t you expect to see the news generated using AI?


AI Tool Turns Blurry Human Photo Into Realistic Computer-Generated HD Faces

Duke University researchers have announced that they have developed an artificial intelligence-based tool that can turn blurry and unrecognisable images of people’s faces into perfect computer-generated portraits in high definition.

According to reports, traditional methods can scale up an image of a human face to at most eight times its original resolution; however, the researchers from Duke University have developed an AI tool called PULSE, which can create a realistic-looking image at 64 times the resolution of the input photo. Rather than sharpening the input directly, the tool searches through artificial intelligence-generated high-resolution face images, analysing facial features such as fine lines, eyelashes and stubble, for candidates that look like the input image when scaled back down to the same size.
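The core trick can be sketched in a deliberately toy setting. In the sketch below (an assumption-laden illustration, not the Duke implementation), a random linear map stands in for the GAN generator and plain gradient descent stands in for PULSE's latent-space optimisation; the search looks for a latent vector whose generated "image" downscales to the low-res input:

```python
import numpy as np

# Toy sketch of PULSE's core idea: instead of adding detail to the input,
# search a generative model's latent space for a high-res output whose
# *downscaled* version matches the low-res input.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 8))                  # toy "generator": latent (8,) -> image (64,)

def generate(z):
    return W @ z

def downscale(img, factor=8):
    return img.reshape(-1, factor).mean(axis=1)   # average-pool 64 pixels -> 8

target_lr = downscale(generate(rng.normal(size=8)))   # the blurry "input photo"

# generate-then-downscale is linear here, so the search objective
# ||downscale(G(z)) - target||^2 can be minimised by gradient descent on z.
A = W.reshape(8, 8, 8).mean(axis=1)           # combined generate+downscale matrix
step = 0.5 / np.linalg.norm(A, 2) ** 2        # step size safe for convergence
z = np.zeros(8)
for _ in range(2000):
    z -= step * 2 * A.T @ (A @ z - target_lr)

final_loss = float(np.sum((downscale(generate(z)) - target_lr) ** 2))
print(final_loss)   # small: the found latent downscales to (almost) the input
```

The real system replaces the linear map with a face GAN, which is why the outputs look like plausible faces rather than the true person, a point the "CloudQuant Thoughts" below touches on.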

Co-author Sachit Menon of Duke University told the media, “While the researchers focused on faces as a proof of concept, the same technique could, in theory, take low-res shots of almost anything and create sharp, realistic-looking pictures, with applications ranging from medicine and microscopy to astronomy and satellite imagery.”

2020-06-15 07:39:58+00:00 Read the full story…
Weighted Interest Score: 2.0882, Raw Interest Score: 1.0886,
Positive Sentiment: 0.0294, Negative Sentiment 0.1177

CloudQuant Thoughts : Sometimes these demonstrations look a little data leaky (for example, the input and output both wearing glasses when the downscaled image showed no evidence of glasses at all). Data leakage would be easy to disprove, and fun for your reviewers, if you provided a web front end and allowed us to submit our own lo-res images!

For fun, here are the Wolfenstein and Doom Characters upscaled to real humans. And a final one to check for bias in the training data…

[Image: Wolfenstein and Doom characters upscaled to real humans]

Now AI Can Recreate How Artists Painted Their Masterpieces

Recently, researchers from MIT introduced a new AI system, known as Timecraft, that can synthesise time-lapse videos depicting how a given painting might have been created. According to the researchers, a painting admits many possible and unique combinations of brushes, strokes, colours, etc., and the goal of this research is to learn to capture this rich range of possibilities.

Recreating a famous painting exactly can take days, even for skilled artists. However, with the advent of AI and ML, we have witnessed the emergence of a number of AI artists in recent years. One of the most celebrated AI artworks is the portrait of Edmond Belamy, which was created by a Generative Adversarial Network (GAN) and sold for an incredible $432,500.

In this research, the researchers presented a recurrent probabilistic model that can take an image of a finished painting and create a time-lapse video depicting how it was most likely to have been painted by the original artist. The system was trained on more than 200 existing time-lapse videos that people posted online of both digital and watercolour paintings.

2020-06-22 05:30:00+00:00 Read the full story…
Weighted Interest Score: 3.0490, Raw Interest Score: 1.6900,
Positive Sentiment: 0.1855, Negative Sentiment 0.3710

CloudQuant Thoughts : It is very low res at the moment but the potential is magnificent.

Computer makers unveil 50 AI servers with Nvidia’s A100 GPUs

Computer makers are unveiling a total of 50 servers with Nvidia’s A100 graphics processing units (GPUs) to power AI, data science, and scientific computing applications. The first GPU based on the Nvidia Ampere architecture, the A100 is the company’s largest leap in GPU performance to date, with features such as the ability for one GPU to be partitioned into seven separate GPUs as needed, Nvidia said. The company made the announcement ahead of the ISC High Performance online event, which is dedicated to high-performance computing.

Nvidia said it now has eight of the top 10 fastest supercomputers in the world, as measured by ISC.

Unveiled in May, the A100 GPU has 54 billion transistors (the on-off switches that are the building blocks of all things electronic) and a server with eight A100 GPUs like the Nvidia DGX A100 can execute 5 petaflops of performance, or about 20 times more than the previous-generation chip Volta. This means central processing unit (CPU) servers that cost $20 million and take up 22 racks can be replaced by new servers that cost $3 million and take up just four GPU-based server racks, said Nvidia product marketing director Paresh Kharya in a press briefing.
2020-06-22 00:00:00 Read the full story…
Weighted Interest Score: 2.5808, Raw Interest Score: 1.4544,
Positive Sentiment: 0.1929, Negative Sentiment 0.1484

CloudQuant Thoughts : I was just thinking how quiet Nvidia had been during the lockdown, then they come out with “eight of the top 10 fastest supercomputers in the world are powered by Nvidia”. VERY IMPRESSIVE!

Space and the profusion of data – the new development frontier?

Space is not just for astronauts. It’s the next frontier for tackling humanity’s most intractable problems such as food security, climate change and social inequality, as revealed at the first World Space Forum late last year.

Developing countries are crossing the space frontier with a growing number of maiden satellite launches and inaugural space initiatives. Yet many lack the capability to navigate the vast profusion of data acquired by space technologies, namely satellite Earth observation and satellite positioning systems, or to effectively utilize satellite communications.

To avoid a leap into the dark and to reap long-term benefits from emerging space programs, developing countries need to address their capacity constraints in processing the tide of raw data that flows from satellites. The process of filtration, refinement and modelling for translating data into usable information in forecasting models requires huge computing capacities and appropriate skills in machine learning and artificial intelligence.

2020-06-10 Read the full story…

Solar Data Analytics Approaches Warp Speed

Data scientists at NASA are employing GPU-powered workstations and local storage to greatly accelerate analysis of images captured by the Solar Dynamic Observatory.

Launched in 2010 to probe our yellow dwarf star and its magnetic field, the solar observatory carries three instruments: an Atmospheric Imaging Assembly, Extreme Ultraviolet Variability Experiment and a Helioseismic and Magnetic Imager. As of the observatory’s 10th anniversary, NASA said SDO has so far captured more than 350 million images of the sun.

Parked in an inclined geosynchronous orbit, SDO is part of NASA’s “Living With a Star” program designed to study the sun as a “magnetic variable star” and how it influences life on Earth. Solar flares can, for example, disrupt critical infrastructure like electrical grids and literally fry electronics.

The challenge for solar data scientists is the sheer volume of imagery—about 20 petabytes and counting. The observatory collects data by recording images of the sun every 1.3 seconds—about as “dynamic” as a space sensor gets. Processing those images requires algorithms to remove errors such as “bad pixels.” Cleaned-up images are then archived.
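Taking the article's cadence at face value, the figures roughly square with each other: one image every 1.3 seconds for ten years comes to about 240 million exposures, and with the observatory's multiple instruments and wavelength channels the archive plausibly exceeds the 350 million images quoted above. A quick sanity check:

```python
# Back-of-the-envelope check of the quoted SDO figures.
seconds_per_year = 365.25 * 24 * 3600          # ~31.6 million seconds
exposures = 10 * seconds_per_year / 1.3        # one image every 1.3 s for 10 years
print(f"{exposures / 1e6:.0f} million exposures")   # ≈ 243 million
```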

2020-06-19 00:00:00 Read the full story…
Weighted Interest Score: 2.9470, Raw Interest Score: 1.2475,
Positive Sentiment: 0.1418, Negative Sentiment 0.3402

CloudQuant Thoughts : A couple of great articles about the huge volumes of data hitting our planet from outer space!!

The trouble with climate finance – Green investing has shortcomings – The financial system and climate change

The financial industry reflects society, but it can change society, too. One question is the role it might play in decarbonising the economy. Judged by today’s fundraising bonanza and the solemn pronouncements by institutional investors, bankers and regulators, you might think that the industry is about to save the planet. Some 500 environmental, social and governance (ESG) funds were launched last year, and many asset managers say they will force companies to cut their emissions and finance new projects. Yet, as we report this week (see article), green finance suffers from woolly thinking, marketing guff and bad data. Finance does have a crucial role in fighting climate change but a far more rigorous approach is needed, and soon.

One of the shortcomings of green finance might be called “materiality”. Some fee-hungry fund managers make hyperbolic claims about their influence, even as big-business bashers pin most of the blame for pollution on companies. The reality is more prosaic. Fund managers have some influence over a big slice of the economy, but many emissions occur outside the firms they control. Estimates by The Economist suggest that publicly listed firms, excluding state-controlled ones, account for 14-32% of the world’s total emissions, depending on the measure you use. Global fund managers cannot directly influence the bosses of state-controlled Chinese coal-fired power plants or Middle Eastern oil and gas producers.

2020-06-20 21:01:53.030000+00:00 Read the full story…

CloudQuant Thoughts : One can argue that ESG-based models are outperforming the market because they are an excellent predictor of the future direction of investment, as younger people with different goals become a key demographic. However, it would be just as easy to argue that the Oil, Gas and Coal industries have had a torrid time with the Coronavirus shut-down, and that by simply avoiding these industries one could have outperformed an ESG-based strategy. But ESG is demonstrating Alpha, and CloudQuant has an ESG dataset available on our Catalog page which includes a white paper, code and data to facilitate simple reproduction of the results.


CVPR 2020

Top Computer Vision Datasets Open-Sourced At CVPR 2020

A good dataset serves as the backbone of an Artificial Intelligence system. Data helps in various ways: it shows how the system is performing, surfaces meaningful insights, and more. At the premier annual Computer Vision and Pattern Recognition conference (CVPR 2020), several datasets have been open-sourced to help the community achieve higher accuracy and deeper insights.

Below, we have listed the top 10 Computer Vision datasets that were open-sourced at the CVPR 2020 conference.

2020-06-19 12:30:28+00:00 Read the full story…
Weighted Interest Score: 5.9147, Raw Interest Score: 1.5534,
Positive Sentiment: 0.1098, Negative Sentiment 0.1098

Everything So Far At CVPR 2020 Conference

The Computer Vision and Pattern Recognition (CVPR) conference is one of the most popular events around the globe, where computer vision experts and researchers gather to share their work and views on trending techniques across various computer vision topics, including object detection, video understanding, visual recognition, among others. This year, Computer Vision (CV) researchers and engineers gathered virtually for the CVPR 2020 conference, which ran from 14 June until 19 June. In this article, we have listed the important topics and tutorials discussed on the 1st and 2nd days of the conference.

2020-06-22 07:30:00+00:00 Read the full story…
Weighted Interest Score: 2.6493, Raw Interest Score: 1.6044,
Positive Sentiment: 0.2149, Negative Sentiment 0.1003

Everything So Far At CVPR 2020 Conference – Part 2

With about 7,000 attendees, the six-day virtual conference on computer vision featured a plethora of paper presentations, workshops and tutorials. From breakthroughs in computer vision to open-sourced datasets and projects, the conference was loaded with interesting topics and areas, including autonomous driving, video sensing, action recognition, and much more.

We have already covered the topics and tutorials from day 1 and 2, i.e. June 14t…
2020-06-22 07:30:00+00:00 Read the full story…
Weighted Interest Score: 2.6493, Raw Interest Score: 1.6044,
Positive Sentiment: 0.2149, Negative Sentiment 0.1003


Facebook just released a database of 100,000 deepfakes to teach AI how to spot them

The videos are designed to help improve AI’s performance—as even the best methods are still not accurate enough.

Deepfakes have struck a nerve with the public and researchers alike. There is something uniquely disturbing about these AI-generated images of people appearing to say or do something they didn’t. With tools for making deepfakes now widely available and relatively easy to use, many also worry that they will be used to spread dangerous misinformation. Politicians can have other people’s words put into their mouths, for example, or appear to take part in situations they never did.

That’s the fear, at least. The truth is that, to a human eye, deepfakes are still relatively easy to spot. And according to a report from cybersecurity firm DeepTrace Labs in October 2019, still the most comprehensive to date, they have not been used in any disinformation campaign. Yet the same report also found that the number of deepfakes posted online was growing quickly, with around 15,000 appearing in the previous seven months. That number will be far larger now. Social-media companies are concerned that deepfakes could soon flood their sites. But detecting them automatically is hard. To address the problem, Facebook wants to use AI to help fight back against AI-generated fakes. To train AIs to spot manipulated videos, it is releasing the largest ever data set of deepfakes: more than 100,000 clips produced using 3,426 actors and a range of existing face-swapping techniques.

Facebook has also announced the winner of its Deepfake Detection Challenge, in which 2,114 participants submitted around 35,000 models trained on its data set via Kaggle. The best model, developed by Selim Seferbekov, a machine-learning engineer at mapping firm Mapbox, was able to detect whether a video was a deepfake with 65% accuracy when tested on a set of 10,000 previously unseen clips, including a mix of new videos generated by Facebook and existing ones taken from the internet.

Read the Full Story…

Principal Component Analysis (PCA) from scratch in Python

Principal Component Analysis is a mathematical technique used for dimensionality reduction. Its goal is to reduce the number of features whilst keeping most of the original information. Today we’ll implement it from scratch, using pure Numpy.

If you’re wondering why PCA is useful for your average machine learning task, here’s the list of top 3 benefits:

  1. Reduces training time — due to smaller dataset
  2. Removes noise — by keeping only what’s relevant
  3. Makes visualization possible — in cases where you have a maximum of 3 principal components

The last one is a biggie — and we’ll see it in action today.

But why is it a biggie? Good question. Imagine that you have a dataset of 10 features and want to visualize it. But how? 10 features = 10 physical dimensions. We as humans kind of suck when it comes to visualizing anything above 3 dimensions — hence the need for dimensionality reduction techniques.

I want to make one important note here — principal component analysis is not a feature selection algorithm. What I mean is that principal component analysis won’t give you the top N features like for example forward selection would do. Instead, it will give you N principal components, where N equals the number of original features.
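The steps the article describes (center the data, eigendecompose the covariance matrix, project onto the top components) can be sketched from scratch in pure NumPy; this is a minimal illustration in the spirit of the article, not its exact code:

```python
import numpy as np

def pca(X, n_components):
    """PCA from scratch: center, covariance, eigendecomposition, project."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # sort descending by explained variance
    components = eigvecs[:, order[:n_components]]
    explained_ratio = eigvals[order[:n_components]] / eigvals.sum()
    return X_centered @ components, explained_ratio

# 100 samples with 10 features, reduced to 3 principal components.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
Z, ratio = pca(X, 3)
print(Z.shape)   # (100, 3)
```

Note that, as the paragraph above says, the three columns of `Z` are new uncorrelated combinations of all ten original features, not a selection of three of them.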
2020-06-20 21:01:53.030000+00:00 Read the full story…
Weighted Interest Score: 2.5634, Raw Interest Score: 1.2551,
Positive Sentiment: 0.0000, Negative Sentiment 0.0000

3 Ways Traders Are Gaining Exposure to Rapid Changes in Technology

Innovative technologies are reshaping the way that we as humans live and work. Companies that specialize in the types of products that are changing the global economy are the focus of long-term investors and active traders alike. As you’ll see in the charts below, now could be an ideal time to increase exposure to this in-demand market segment.

SPDR Kensho New Economies Composite ETF (KOMP) – Investors who are most interested in adding exposure to innovative companies are often prudent to examine the top holdings of exchange-traded products such as the SPDR Kensho New Economies Composite ETF (KOMP). For those unaware, the fund’s managers seek to utilize artificial intelligence and quantitative weighting to track an index of companies that leverage exponential processing power, robotics, AI, and automation.
2020-06-16 17:33:06.824000+00:00 Read the full story…
Weighted Interest Score: 4.4941, Raw Interest Score: 1.9053,
Positive Sentiment: 0.3176, Negative Sentiment 0.1155

CloudQuant Thoughts : Investing in the Shares that make up the majority of a successful ETF is a good way to quickly identify key movers.

Top 8 Algorithms for Object Detection

Object detection has been witnessing rapid, revolutionary change in the field of computer vision. Because it combines object classification with object localisation, it is one of the most challenging topics in the domain. In simple words, the goal of this technique is to determine where objects are located in a given image (object localisation) and which category each object belongs to (object classification).

In this article, we list the 8 best algorithms for object detection one must know.

  1. Fast R-CNN
  2. Faster R-CNN
  3. Histogram of Oriented Gradients (HOG)
  4. Region-based Convolutional Neural Networks (R-CNN)
  5. Region-based Fully Convolutional Network (R-FCN)
  6. Single Shot Detector (SSD)
  7. Spatial Pyramid Pooling (SPP-net)
  8. YOLO (You Only Look Once)

2020-06-14 Read the full story…

How to Build a Simple Machine Learning Web App in Python Part 2: An ML-Powered Web App in Less than 50 Lines of Code

In this article, I will show you how to build a simple machine learning powered data science web app in Python using the streamlit library in less than 50 lines of code.
The data science life cycle essentially comprises data collection, data cleaning, exploratory data analysis, model building and model deployment. For more information, please check out the excellent video by Ken Jee on Different Data Science Roles Explained (by a Data Scientist). A summary infographic of this life cycle is shown below:

As a Data Scientist or Machine Learning Engineer, it is extremely important to be able to deploy our data science projects, as this completes the data science life cycle. Traditional deployment of machine learning models with established frameworks such as Django or Flask can be a daunting and/or time-consuming task. Video Link
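Whatever the web layer (streamlit in the article, or Flask/Django), the core of such an app is just a model that is trained and serialized once, then reloaded at serving time. A minimal sketch of that train/serialize/reload cycle with scikit-learn and pickle (iris and a random forest are illustrative choices here, not necessarily the article's):

```python
import pickle
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train once at build time.
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Serialize the fitted model; a real app would write this to disk, and the
# web layer would load it on startup instead of retraining per request.
blob = pickle.dumps(model)
served = pickle.loads(blob)

pred = served.predict(X[:1])
print(pred)   # predicted class of the first iris sample
```

The web front end then only needs to collect user inputs and call `served.predict`, which is why such apps can stay under 50 lines.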
2020-06-14 https://towardsdatascience.com/how-to-build-a-simple-machine-learning-web-app-in-python-68a45a0e0291 Read the full story…

Pagaya raises $102 million to manage assets with AI

Pagaya, an AI-driven institutional asset manager that focuses on fixed income and consumer credit markets, today announced it raised $102 million in equity financing. CEO Gal Krubiner said the infusion will enable Pagaya to grow its data science team, accelerate R&D, and continue its pursuit of new asset classes including real estate, auto loans, mortgages, and corporate credit.

Pagaya applies machine intelligence to securitization — the conversion of an asset (usually a loan) into marketable securities (e.g., mortgage-backed securities) that are sold to other investors — and loan collateralization. It eschews the traditional method of securitizing pools of previously assembled asset-backed securities (ABS) for a more bespoke approach, employing algorithms to compile discretionary funds for institutional investors such as pension funds, insurance companies, and banks. Pagaya selects and buys individual loans by analyzing emerging alternative asset classes, after which it assesses their risk and draws on “millions” of signals to predict their returns.

2020-06-17 Read the Full Story…

The Difference Between Various Data Science Job Titles

As the data science field has exploded in popularity, it is important to note that there are other job titles with overlapping functions. Job titles are so confusing nowadays that one company might use a label that means something completely different elsewhere, so it pays to focus on the responsibilities, technical skills and experience behind data-related job titles. In this article, we take a look at the similarities and differences between data job titles.

Data Scientist vs Data Engineer vs Data Analyst vs Statistician.

2020-06-22 08:30:00+00:00 Read the full story…
Weighted Interest Score: 5.5458, Raw Interest Score: 2.7735,
Positive Sentiment: 0.1761, Negative Sentiment 0.1101

Broadridge Unveils AI-Driven Corporate Bond Trading Platform

Broadridge Financial Solutions, Inc. (NYSE: BR), a global fintech leader, today announced that its new AI-driven corporate bond trading platform, LTX®, has executed its first trades. Broadridge has partnered with Jim Toffey, founder of Tradeweb Markets, to create LTX, which combines powerful artificial intelligence (AI) with a new digital execution protocol that enables broker-dealers to significantly improve market liquidity, efficiency and execution for their buy-side customers.

Built on Broadridge’s US Fixed Income post-trade processing platform, which processes over $6 trillion in notional volume per day across 40+ dealer clients, LTX uses AI (LTX AISM) to help broker-dealers digitize their franchise to maximize liquidity for asset managers while delivering improved transparency, “BestEx” and minimizing information leakage.

2020-06-17 12:32:28+00:00 Read the full story…
Weighted Interest Score: 5.3747, Raw Interest Score: 2.7275,
Positive Sentiment: 0.6373, Negative Sentiment 0.1275

Artificial intelligence job growth crashed because of the coronavirus, but it’s starting to pick back up. Here’s what you need to know about the job market and how to pick up skills

As the coronavirus crisis has shrunk the job market in general, AI job growth has slowed too. Both LinkedIn and ZipRecruiter have seen a decrease in AI job posting growth since mid-March, but there are signs that AI job growth could bounce back, maybe even stronger than before. Below are eight online resources for job-seekers looking to pick up AI expertise or skills.

Artificial intelligence has been one of the hottest areas of tech and the economy in the last few years, and AI job growth has reflected that: AI roles ranked at the top of LinkedIn’s Job Of Tomorrow report in December, and the World Economic Forum estimated in January that 16% of new jobs would be in AI.

Then came the COVID-19 pandemic and corresponding economic downturn, which led to up to 40 million US jobs disappearing. How is the “job of tomorrow” faring now? New data from job boards shows that while the number of AI jobs is still growing, that growth has slowed dramatically during the coronavirus crisis.

2020-06-15 00:00:00 Read the full story…
Weighted Interest Score: 5.2723, Raw Interest Score: 2.1838,
Positive Sentiment: 0.1514, Negative Sentiment 0.3676

Deploying Machine Learning Has Never Been This Easy

According to PwC, AI’s potential global economic impact will reach USD 15.7 trillion by 2030. However, enterprises looking to deploy AI are often hampered by a lack of time, trust and talent. Especially in highly regulated sectors such as healthcare and finance, convincing customers to adopt AI methodologies is an uphill task.

Of late, the AI community has seen a marked shift in AI adoption with the advent of AutoML tools and the introduction of customised hardware to cater to the needs of the algorithms. One of the most widely used AutoML tools in the industry is H2O Driverless AI. And when it comes to hardware, Intel has been consistently updating its tool stack to meet the high computational demands of AI workflows.

Now H2O.ai and Intel, two companies who have been spearheading the democratisation of the AI movement, join hands to develop solutions that leverage software and hardware capabilities respectively.

2020-06-19 07:20:42+00:00 Read the full story…
Weighted Interest Score: 5.1110, Raw Interest Score: 1.9648,
Positive Sentiment: 0.1551, Negative Sentiment 0.2068

Alation and Databricks Accelerate Data Discovery and Cloud Data Migration

Alation, provider of data catalog software, is partnering with Databricks, provider of a unified analytics platform for data and AI, to help accelerate data science-led innovations. According to the companies, a new integration provides data teams with a platform to identify and govern cloud data lakes, discover and leverage the best data for data science and analytics, and collaborate on data to deliver high quality predictive models and business insights.

By identifying the most widely used assets, Alation enables data teams to prioritize data for migration to the cloud. Once in the cloud, Alation provides data teams with visibility into the assets residing in the data lake and allows for context and understanding of the data, as well as collaboration among subject matter experts.

2020-06-17 00:00:00 Read the full story…
Weighted Interest Score: 4.9172, Raw Interest Score: 2.1222,
Positive Sentiment: 0.5694, Negative Sentiment 0.1035

Microsoft acquires ADRM Software, leader in large-scale, industry-specific data models

In advancing our mission to empower every person and organization on the planet to achieve more, Microsoft has been investing in the power of data and artificial intelligence (AI) to continuously innovate, influence and enhance customer experience and partner growth.

Data and AI are the foundation of modern technological innovation, yet businesses today struggle to unlock the full value data has to offer as fragmented data estates hinder digital transformation. Without a comprehensive and integrated view of their data, companies are at a competitive disadvantage, which hinders digital adoption and data-driven innovation.

Today, we are excited to announce the acquisition of ADRM Software, a leading provider of large-scale industry data models, which are used by large companies worldwide as information blueprints. ADRM’s robust industry data models have been built and refined over decades for business-critical analytics.

2020-06-18 00:00:00 Read the full story…
Weighted Interest Score: 4.2753, Raw Interest Score: 2.3952,
Positive Sentiment: 0.5988, Negative Sentiment 0.2139

COVID-19 Gives AI a Reality Check

While it seems unlikely that AI will enter another nuclear winter, the current COVID-19 situation is giving enterprises the opportunity to rethink their AI strategies, giving the better AI projects more room to run, while discarding the borderline AI projects that were unlikely to pay off.

The macro economic situation deteriorated rapidly thanks to COVID-19. In the span of a few weeks in late March, the United States went from record-low unemplo…
2020-06-18 00:00:00 Read the full story…
Weighted Interest Score: 4.2218, Raw Interest Score: 1.3963,
Positive Sentiment: 0.3103, Negative Sentiment 0.4893

Operationalizing of Machine Learning Data (Video Behind Registration Wall)

A challenge of ML is operationalizing the data volume, performance, and maintenance. In this session, Rashmi Gupta explains how to use tools for orchestration and version control to streamline datasets. She also discusses how to secure data to ensure that production control access is streamlined for testing.
2020-06-16 00:00:00 Read the full story…
Weighted Interest Score: 4.2071, Raw Interest Score: 1.6181,
Positive Sentiment: 0.0000, Negative Sentiment 0.3236

Hands-On Guide to Predict Fake News Using Logistic Regression, SVM and Naive Bayes Methods

Millions of news items are published on the internet every day; include tweets from Twitter and that figure multiplies several times over. The internet is becoming the biggest channel for spreading fake news, so a mechanism is needed to identify fake news published online and warn readers accordingly. Researchers have proposed methods to identify fake news by analysing the text of articles with machine learning techniques, and here we will discuss machine learning techniques that can identify fake news correctly.

In this article, we will train machine learning classifiers to predict whether given news is real or fake. For this task, we will train three popular classification algorithms, Logistic Regression, the Support Vector Classifier and Naive Bayes, to predict fake news. After evaluating the performance of all three algorithms, we will conclude which of the three performs best on the task.
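The comparison the article describes can be sketched with scikit-learn. The tiny corpus below is hypothetical, standing in for the article's real dataset, and accuracy is measured on the training set purely for illustration (a real evaluation would use a held-out test split):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# Hypothetical headlines: label 0 = real news, 1 = fake news.
texts = [
    "government confirms new budget figures in official report",
    "central bank publishes quarterly inflation statistics",
    "city council approves infrastructure spending plan",
    "ministry releases audited employment data for june",
    "miracle cure doctors hate revealed in shocking secret video",
    "celebrity clone spotted in shocking leaked photo",
    "secret miracle diet melts fat overnight shocking results",
    "aliens endorse candidate in shocking leaked secret tape",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# TF-IDF turns each headline into a weighted bag-of-words vector.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

# Fit all three classifiers named in the article and compare accuracy.
models = {
    "logistic_regression": LogisticRegression(),
    "linear_svm": LinearSVC(),
    "naive_bayes": MultinomialNB(),
}
scores = {name: m.fit(X, labels).score(X, labels) for name, m in models.items()}
print(scores)
```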
2020-06-22 09:30:00+00:00 Read the full story…
Weighted Interest Score: 4.0722, Raw Interest Score: 1.6475,
Positive Sentiment: 0.0960, Negative Sentiment 0.0800

How to make your ML algorithms think like a human

The finance industry is a prime use case for machine learning, thanks to the abundant data sets, access to capital and strong incentive for efficiency and predicting future outcomes. While rule-based workflows are well embedded within the industry, many businesses are now turning to machine learning to automate the algorithm building process, especially when it comes to fintech.

As digital services become more widespread, financial organisations need to move beyond rule-based mechanisms and manual data analysis to ensure compliance, security and customer service. Machine learning is more scalable, flexible and reliable when implemented properly, but requires the right data to deliver actionable insights.

This is especially the case when it comes to making predictions about human behaviour. At a recent developer meetup, I heard from Ben Houghton, Head of Data Science for Barclays Payments, about his data approach and how he makes his algorithms think like a human.

2020-06-18 16:24:58 Read the full story…
Weighted Interest Score: 4.0388, Raw Interest Score: 1.7202,
Positive Sentiment: 0.1811, Negative Sentiment 0.2037

Scaling data science to create new business value (Video behind Registration Wall)

With businesses facing economic uncertainty, the potential of AI at scale is no longer a goal. It is an essential business priority. This is why Avanade and Microsoft have teamed up to power advanced analytics with Azure Synapse. Learn the four questions you should ask yourself to uncover the value of your data at scale.
2020-06-17 00:00:00 Read the full story…
Weighted Interest Score: 4.0373, Raw Interest Score: 1.5528,
Positive Sentiment: 0.0000, Negative Sentiment 0.6211

Oil & Gas Industry Transforming Itself with the Help of AI

The oil and gas industry is turning to AI to help cut operating costs, predict equipment failure, and increase oil and gas output.

A faulty well pump at an unmanned platform in the North Sea disrupted production in early 2019 for Aker BP, a Norwegian oil company, according to an account in the Wall Street Journal. The company installed an AI program that monitors data from sensors on the pump, flagging glitches before they can cause a shutdown, stated Lars Atle Andersen, VP of operations for the firm. Now he flies in engineers to fix such problems ahead of time and prevent a shutdown, he stated.
2020-06-18 21:30:16+00:00 Read the full story…
Weighted Interest Score: 3.9699, Raw Interest Score: 1.6643,
Positive Sentiment: 0.1884, Negative Sentiment 0.2355

The Need For A Data-Centric Approach To Compliance: Report

SteelEye, the compliance technology and data analytics firm, today published “Data-Driven Financial Services Compliance – Understanding the Opportunity”, a white paper which explores the key challenges faced by compliance teams within financial markets as they navigate regulatory change.

Increased complexity, rising costs and significant financial, operational and reputational risk have accompanied the wave of new regulations implemented over the past decade. Add to that the pandemic, which brought with it market volatility and an exponential increase in the number of market abuse alerts, and financial compliance has become even more complex.
2020-06-18 13:37:54+00:00 Read the full story…
Weighted Interest Score: 3.8676, Raw Interest Score: 1.8068,
Positive Sentiment: 0.3127, Negative Sentiment 0.1737

Staying On Top of ML Model and Data Drift

A lot of things can go wrong when developing machine learning models. You can use poor quality data, mistake correlation for causation, or overfit your model to the training data, just to name a few. But there are also a few gotchas that data scientists need to look out for after the models have been deployed into production, specifically around model and data drift.

Data scientists pay close attention to the data they use to train their machine learning models, as they should. Machine learning models, after all, are simply functions of data. But the work is not over once the models are put into production, as data scientists must monitor the models to be sure they’re not drifting.

2020-06-16 00:00:00 Read the full story…
Weighted Interest Score: 3.5581, Raw Interest Score: 2.0420,
Positive Sentiment: 0.1789, Negative Sentiment 0.2534

The revised pessimistic projection for Digital wealth AUM does not make sense

Efi Pylarinou is the founder of Efi Pylarinou Advisory and a Fintech/Blockchain influencer – No.3 influencer in the finance sector by Refinitiv Global Social Media 2019.

Consulting practices call for five-year predictions on all sorts of topics. The so-called robo-advisor subsector in investing has not escaped these studies.

Back in 2016, was when Vanguard was making its first leapfrogging attempts in a space that Betterment and Wealthfront had broug…
2020-06-16 00:00:00 Read the full story…
Weighted Interest Score: 3.4815, Raw Interest Score: 1.6380,
Positive Sentiment: 0.0642, Negative Sentiment 0.0482

How to Create a Linear Regression Model

You can perform predictive modeling in Excel in just a few steps. Here’s a step-by-step tutorial on how to build a linear regression model in Excel and how to interpret the results.

Excel for predictive modeling? Really? That’s typically the first reaction I get when I bring up the subject. This is followed by an incredulous look when I demonstrate how we can leverage the flexible nature of Excel to build predictive models for our data science and analytics projects. Let me ask you a question – if the shops around you started collecting customer data, could they adopt a data-based strategy to sell their goods? Can they forecast their sales or estimate the number of products that might be sold?

Now you must be wondering how in the world they will build a complex statistical model that can predict these things. And learning analytics or hiring an analyst might be beyond their scope. Here’s the good news – they don’t need to. Microsoft Excel offers us the ability to conjure up predictive models without having to write complex code that flies over most people’s heads.
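The same ordinary least squares fit that Excel performs (for example via its LINEST function or the Analysis ToolPak) can be sketched in a few lines of Python with NumPy. The foot-traffic/sales numbers below are a made-up illustration, not data from the article:

```python
import numpy as np

# Hypothetical example: predict daily sales from store foot traffic.
foot_traffic = np.array([120, 150, 170, 200, 240, 300], dtype=float)
sales = np.array([1300, 1600, 1800, 2100, 2500, 3100], dtype=float)

# Ordinary least squares fit: sales ~ slope * foot_traffic + intercept.
slope, intercept = np.polyfit(foot_traffic, sales, deg=1)

# R-squared measures how much of the variance the fitted line explains.
predicted = slope * foot_traffic + intercept
ss_res = np.sum((sales - predicted) ** 2)
ss_tot = np.sum((sales - sales.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"slope={slope:.2f}, intercept={intercept:.2f}, R2={r_squared:.3f}")
```

Excel's regression output reports the same three quantities (coefficient, intercept, R²), so this is a convenient way to sanity-check a spreadsheet model.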

2020-06-21 19:28:53+00:00 Read the full story…
Weighted Interest Score: 3.4720, Raw Interest Score: 1.7928,
Positive Sentiment: 0.2255, Negative Sentiment 0.0677

Dream of Becoming a Big Data Engineer? Discover What Sets Us Apart From Software Engineers

Engineering is an essential element of all corporations. Without it, companies are unable to create, maintain, and upgrade their products. Technology enterprises rely on their engineering departments to survive in a competitive world. Even so, not all engineers perform the same set of tasks. In heavily technology-based companies, software engineers are one of the most critical resources. They build programs, create software, and maintain the functionality of the systems. Many other career paths diverge from software engineering, each specializing in a particular subject. In the data landscape, corporations face tremendous growth in data volume, and someone needs to step up and claim responsibility for managing that data. That marked the dawn of big data engineers. A big data engineer can evolve from a database administrator, a data architect, or a data analyst.

2020-06-20 17:45:51.905000+00:00 Read the full story…
Weighted Interest Score: 3.4175, Raw Interest Score: 1.8887,
Positive Sentiment: 0.1437, Negative Sentiment 0.1437

5 Tips for Kickstarting Your Data Career

Four years ago, I was a recent college grad, starting out my career at a four-person IoT startup. One of my first assignments was to research and propose a solution for an AI-based digital assistant for military settings. Although I studied engineering in college and worked in a lab assisting machine learning research, undertaking a huge natural language processing project without an experienced data scientist/engineer in-house was a daunting task. Inevitably, I had to resort to online resources to fill in the gaps and find mentors outside the organization for direction as well as personal growth.

Fast forward to the present: I now work on the data infrastructure for our IoT platform and train fullstack engineers and product managers in the company in data science so they can analyze our massive IoT data. This post is a compilation of all the resources I used and the tips I learned over the years of growing my own career in data science/engineering. Whether you are an engineer looking to break into the data industry or a recent grad preparing for your new role, I hope you find my tips useful.

2020-06-22 02:45:43.971000+00:00 Read the full story…
Weighted Interest Score: 3.2220, Raw Interest Score: 1.3601,
Positive Sentiment: 0.2218, Negative Sentiment 0.1331

AI Adoption Spurs Efforts to Reskill the Workforce

As AI adoption brings out changes in the workplace, workers are challenged to obtain needed AI skills and business leaders are working to adapt.

And as the COVID-19 pandemic has led to a shift to online learning, companies such as Udacity—who have been in that business for years—are in a good position to help.

Business leaders may be caught between competing objectives of continuing to deliver strong financial performance while making investments in hiring, workforce training and new technologies that support growth, suggested the author of a recent piece in Harvard Business Review.

2020-06-18 21:30:13+00:00 Read the full story…
Weighted Interest Score: 3.2167, Raw Interest Score: 1.4864,
Positive Sentiment: 0.2713, Negative Sentiment 0.1416

Game-Changing Technologies in the Data Environment of 2020

AI and machine learning were cited by several industry leaders as the most important technologies shaping today’s data environments. “We’re starting to see more success in specific use cases of machine learning, such as anomaly detection with system events, natural language processing, entity extraction, and classification technologies,” said Ranga Rajagopalan, vice president of product management for Commvault.

AI is critical to competing in the emerging economy, as it “makes it possible to go beyond what the human eye can detect and focus on a range of bad behaviors,” said David Ngo, vice president of product and engineering at Metallic. “It helps predict, identify, address, and solve our data needs.”

AI and automation are making IT and data professionals’ roles easier as well—enabling automatic processing of billions of dependencies in real time, continuous monitoring of the full stack for system degradation and performance anomalies, and delivering precise answers prioritized by business impact, said Jakub Mierzewski, product manager at Dynatrace. “With the right AI and automation technologies and practices in place, teams can shift from reactive to proactive, from guessing to knowing, from sifting through logs or becoming tied up in war rooms to having deep insights and data that drive innovation, acceleration, and business value. It’s like having an entire new team working for you 24×7, allowing your people to focus on what really matters.”

2020-06-17 00:00:00 Read the full story…
Weighted Interest Score: 3.2142, Raw Interest Score: 1.7033,
Positive Sentiment: 0.2241, Negative Sentiment 0.2689

Nebula Graph Joins Database Race

As the open source Nebula Graph database moves closer to commercial availability, the technology’s developer has announced an early funding round led by several Chinese investors. VEsoft Inc. said this week it would use the $8 million round to bring Nebula Graph to the European and North American markets as well as the rest of Asia.

Nebula Graph was released as an open source project in May 2019. The first beta version was released last June. The startup is poised to move from its beta version to general availability, addressing a market estimated by Gartner as growing to as much as $10 billion over the next several years. Competing graph database startups have so far raised more than $130 million in venture funding. Among them is TigerGraph, which announced a $32 million Series B funding round last fall as it released a cloud-based graph analytics service. Nebula Graph will be offered as a cloud service, VEsoft said Monday (June 15).

2020-06-15 00:00:00 Read the full story…
Weighted Interest Score: 3.1643, Raw Interest Score: 1.7217,
Positive Sentiment: 0.0465, Negative Sentiment 0.0000

Defensive or Offensive, Every Strategy Must Start With Trust

As digital transformation becomes mainstream, digitization is no longer a differentiating advantage. Enterprises must answer to a new set of expectations from customers, employees and business partners, and all while prioritizing compliance with tightening data regulations. To ensure they aren’t hindered by bad data – or the inability to leverage good data – companies must balance both offensive and defensive strategies.

This two-pronged approac…
2020-06-17 11:00:00+00:00 Read the full story…
Weighted Interest Score: 3.1242, Raw Interest Score: 1.6649,
Positive Sentiment: 0.2775, Negative Sentiment 0.4096

Privacera’s Latest Release Integrates with Databricks to Offer Robust Governance

Privacera, a cloud data governance and security leader founded by the creators of Apache Ranger, is releasing the latest version of the Privacera Platform, an enterprise data governance and security solution for machine learning and analytic workloads in the public cloud.

Leveraging the Apache Ranger architecture, the Privacera Platform integrates with Databricks to help ensure consistent governance, security, and compliance across all data science, machine learning, and analytics workloads.

Privacera provides secure data sharing across the enterprise and balances the competing mandates of data democratization while adhering to applicable privacy and industry regulations such as GDPR and CCPA.

2020-06-17 00:00:00 Read the full story…
Weighted Interest Score: 2.9148, Raw Interest Score: 1.6622,
Positive Sentiment: 0.3145, Negative Sentiment 0.0000

Expanding Your Data Science and Machine Learning Capabilities (Registration Wall)

Surviving and thriving with data science and machine learning means not only having the right platforms, tools and skills, but identifying use cases and implementing processes that can deliver repeatable, scalable business value. The challenges are numerous, from selecting data sets and data platforms, to architecting and optimizing data pipelines, and model training and deployment. In response, new solutions have emerged to deliver key capabilities in areas including visualization, self-service and real-time analytics. Along with the rise of DataOps, greater collaboration and automation have been identified as key success factors.

2020-06-25 00:00:00 Read the full story…
Weighted Interest Score: 2.8463, Raw Interest Score: 1.8130,
Positive Sentiment: 0.2863, Negative Sentiment 0.0954

The firms floating now offer clues for canny investors

After a lockdown hiatus, the IPO market has returned with a bang. This week saw success for US listings in biotech and pharmaceuticals – and news is expected in the next few weeks on major flotations in cloud computing, big data and artificial intelligence. This forthcoming wave of IPOs, with boards prepared to pull the trigger on a stock market listing despite an ongoing pandemic, is highlighting areas that are likely to do well whatever happens with Covid-19.

Also in the pipeline is artificial intelligence (AI) and behavioural economics group Lemonade, which is a fintech group with big hopes of disrupting the insurance sector. The company uses data analysis and an AI chat bot called Maya to calculate insurance rates for homeowners and renters – cutting costs and saving money for insurance issuers and customers alike.

2020-06-19 00:00:00 Read the full story…
Weighted Interest Score: 2.8415, Raw Interest Score: 1.3122,
Positive Sentiment: 0.1093, Negative Sentiment 0.2734

Evolution of data science: How it will change over the next decade

Although data science, as an academic discipline, has been around for more than 50 years, it wasn’t until around 2010 that it entered the mainstream consciousness. It happened as a new wave of businesses recognized that data was the key to mastery of modern markets and started making it their strategic focus. In the years since, the field of data science has seen explosive growth as well as some fast-paced developments as higher demand has spurred innovation.

As far as the field of data science has come since 2010, there’s every reason to believe that the next decade will bring even more change. With simultaneous advances in related technology fields and new approaches by the best and brightest minds in the industry, data science in 2030 will bear little resemblance to the state of the art today. Here’s a look at how data science is set to evolve over the next decade.

2020-06-16 10:33:57+00:00 Read the full story…
Weighted Interest Score: 2.8079, Raw Interest Score: 1.4761,
Positive Sentiment: 0.2312, Negative Sentiment 0.1601

Lenovo Announces Solutions Purpose-Built For Analytics And AI Workloads

Lenovo Data Center Group (DCG) announced the launch of the ThinkSystem SR860 V2 and SR850 V2 servers, which now features 3rd Gen Intel Xeon Scalable processors with enhanced support for SAP HANA based on Intel Optane persistent memory 200 series. These solutions will allow customers to simplify common data management challenges. In addition, Lenovo announced new remote deployment service offerings for the ThinkSystem DM7100 storage systems.

With these offerings, customers can more easily navigate complex data management needs to deliver actionable business intelligence through artificial intelligence (AI) and analytics while getting maximum results when combined with business applications like SAP HANA®.

2020-06-19 07:31:41+00:00 Read the full story…
Weighted Interest Score: 2.8055, Raw Interest Score: 1.6920,
Positive Sentiment: 0.4999, Negative Sentiment 0.2115

AI2 spinout Lexion unveils Slack chatbot that automatically finds legal contracts

Lexion, a Seattle startup that spun out of the Allen Institute for Artificial Intelligence (AI2), this week rolled out a new Slack chatbot that can instantly find legal contracts. The chatbot analyzes a request via Slack and locates relevant information based on viewing permissions that a legal team has set.

“This not only helps in-house legal teams save hours each week on fishing contracts for sales/support/biz dev/executives, but is especially valuable now as teams work remotely with Slack as their main means of communication,” said Lexion CEO Gaurav Oberoi. It’s another example of companies adding features as the economic crisis changes how their customers work. Fellow Seattle startup Uplevel last week rolled out new tools to help managers measure engineer productivity in remote work settings.

2020-06-18 21:00:00+00:00 Read the full story…
Weighted Interest Score: 2.6658, Raw Interest Score: 1.5058,
Positive Sentiment: 0.0684, Negative Sentiment 0.0684

Staid Insurance Industry Exploring AI With Some Caution

Insurance industry taking careful steps in exploring AI for usage-based insurance, deep personalization, faster claims settlements.

The insurance industry is dominated by massive national brands and legacy product lines that have remained largely unchanged for decades. It is a staid industry. This makes the industry ripe for disruption by new technologies and approaches, especially those enabled by AI.

Venture capitalists see an opportunity and are investing. New York-based Lemonade, started in 2015, has attracted $480 million in funding so far, according to Crunchbase. Lemonade, which started in homeowner and renter’s insurance, recently filed to go public. Released financial information shows the company has a way to go to become profitable.

Auto insurance, which makes up more than 40 percent of the overall business, is likely to shrink as self-driving cars come onto the roads and fulfill their promise of making driving safer, suggested a KPMG report in 2015. The consultants predicted the auto insurance market will shrink 60 percent over the next 25 years.

2020-06-19 19:34:51+00:00 Read the full story…
Weighted Interest Score: 2.4238, Raw Interest Score: 1.2302,
Positive Sentiment: 0.2139, Negative Sentiment 0.2496

Roll Call: Visual Graph Data Models Today

Five years ago, I wrote a book about a new approach to Data Modeling — one that “turns the inside out.” It discussed visual Graph Data Modeling. For well over 30 years, relational modeling and normalization were the name of the game. One could ask: if normalization was the answer, what was the problem? But there is something upside-down in that approach.

Data analysis (and modeling) is much like exploration — almost literally. The data modeler wanders around searching for structure and content. This requires perception and cognitive skills, supported by intuition (a psychological phenomenon), that, together, will determine how well the landscape of business semantics is mapped.

Mapping is what we do; we explore the unknowns, draw the maps, and post the “Here be dragons” warnings. Of course, there are technical skills involved, and, surprisingly, the most important ones come from psychology and visualization (i.e., perception and cognition) rather than pure mathematical ability. Think of concept maps versus UML. And think of graphs versus SQL.

2020-06-22 07:35:29+00:00 Read the full story…
Weighted Interest Score: 2.3390, Raw Interest Score: 1.4204,
Positive Sentiment: 0.2007, Negative Sentiment 0.0309

AUC-ROC Curve in Machine Learning Clearly Explained

AUC-ROC Curve – The Star Performer!
You’ve built your machine learning model – so what’s next? You need to evaluate it and validate how good (or bad) it is, so you can then decide on whether to implement it. That’s where the AUC-ROC curve comes in.

The name might be a mouthful, but it is just saying that we are calculating the “Area Under the Curve” (AUC) of the “Receiver Operating Characteristic” (ROC). Confused? I feel you! I have been in your shoes. But don’t worry, we will see what these terms mean in detail and everything will be a piece of cake!

For now, just know that the AUC-ROC curve helps us visualize how well our machine learning classifier is performing. Although it works for only binary classification problems, we will see towards the end how we can extend it to evaluate multi-class classification problems too.

We’ll cover topics like sensitivity and specificity as well since these are key topics behind the AUC-ROC curve.
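As a concrete illustration of the idea, here is a tiny binary-classification example (made-up labels and scores) using scikit-learn's standard metrics API:

```python
from sklearn.metrics import roc_auc_score, roc_curve

# Toy binary classification problem: true labels and model scores.
y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]

# AUC summarizes the ROC curve as a single number in [0, 1]:
# 0.5 is random guessing, 1.0 is a perfect ranking of positives.
auc = roc_auc_score(y_true, y_scores)

# roc_curve returns the (FPR, TPR) points that trace the curve itself,
# one point per decision threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_scores)

print(f"AUC = {auc}")
```

For this toy example the AUC is 0.75: of the four (positive, negative) pairs, the model scores the positive higher in three.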

2020-06-15 19:59:46+00:00 Read the full story…
Weighted Interest Score: 2.2828, Raw Interest Score: 1.0438,
Positive Sentiment: 0.4234, Negative Sentiment 0.5515

Blurred Lines: SAS and Microsoft To Go Deep in Analytics Partnership

The lines separating SAS and Microsoft analytics and AI software will blur as part of a major strategic expansion between the two companies announced today that will see Azure become the preferred cloud for SAS and technical integration across their respective product lines in the years ahead.

As part of the partnership, SAS has picked Microsoft Azure as the preferred provider for SAS Cloud, its suite of managed analytic and AI offerings. The company will begin the process of migrating SAS Cloud customers and offerings to Azure soon.

This deal is not exclusive, says SAS Executive Vice President and CIO Jay Upchurch. The SAS software will continue to be cloud agnostic, and customers will have the choice to run it on any cloud they want. “However, over the years ahead, SAS will migrate our internal operation and our global SAS Cloud business to Microsoft Azure,” he says.

2020-06-15 00:00:00 Read the full story…
Weighted Interest Score: 2.2790, Raw Interest Score: 1.1395,
Positive Sentiment: 0.2779, Negative Sentiment 0.0556

A former Google X employee just raised $21 million for his AI startup Streamlit, which is already being used by companies like Uber and Stitch Fix

On Tuesday, the artificial intelligence startup Streamlit announced it raised $21 million in Series A funding.

While still at Alphabet’s research subsidiary Google X (now known as X), Streamlit CEO and co-founder Adrien Treuille worked on all sorts of projects, from self-driving cars to Google Glass. During this time, he first started to see a problem: engineers building artificial intelligence products like a self-driving car constantly faced ever-growing mountains of data, and they often had difficulty working with it all.

He started building software to address the bottleneck, making it easier for data scientists and developers to build AI apps and experiment with machine learning, a field of AI that enables computer programs to learn from data, identify patterns, and make decisions on their own, without being told what to do. Machine learning is used in self-driving cars, email spam filters, mapping, and more. What started as a personal project ended up getting used by Uber and Stitch Fix. The next thing Treuille knew, investors wanted to put money into it. And on Tuesday, Streamlit announced it raised $21 million in Series A funding led by Gradient Ventures and GGV Capital.

2020-06-16 00:00:00 Read the full story…
Weighted Interest Score: 2.2104, Raw Interest Score: 1.3626,
Positive Sentiment: 0.1065, Negative Sentiment 0.1916

FogHorn Introduces Workplace Safety Solution that Leverages AI and IoT Data

FogHorn, a developer of Edge AI software for industrial and commercial Internet of Things (IoT) solutions, is creating Lightning Health & Safety Solutions, aiming to improve the safety of workplaces and help mitigate the spread of contagious illnesses.

Lightning Solutions, a new product line from FogHorn, are out-of-the-box packages of FogHorn’s Lightning Edge AI platform, preconfigured with use-case specific machine learning models and visualization dashboards.

Out-of-the-box solutions allow organizations to rapidly deploy edge intelligence and AI and immediately derive insights to common problems.

The FogHorn Lightning Health & Safety Solution suite includes a range of out-of-the-box solutions that can be used individually or together to create a comprehensive system.

An enterprise edition of the solutions is also available that can include further customizations, data science and integrations with customer’s existing IT systems, video management software, and access control systems.

Solutions include:

  • Health Monitoring: elevated temperature detection, cough detection, hand washing monitoring, social distancing monitoring, and mask / facial covering detection
  • Safety Monitoring: personal protective equipment, including hard hats, footwear, eyewear, vests, and boots
  • Hazard Detection: custom solution engagements are also available including crane and falling debris warnings, leak detection and spill hazards

2020-06-16 00:00:00 Read the full story…
Weighted Interest Score: 2.0826, Raw Interest Score: 1.1858,
Positive Sentiment: 0.0719, Negative Sentiment 0.2515

Challenges Data Teams Face While Working Remotely

Data science is a domain where working from home requires specific conditions, including the type of projects, access to tools, the kind of tasks, staff engagement, connectivity, and collaboration with the rest of the team/company. Such factors alone can keep a data science team from being productive and efficient.

While you may think that not much has to change and that data science professionals can work smoothly from their homes, that may not be the case at all. According to AIM Research, 34% of analytics professionals have reported a negative impact on their productivity due to work-from-home scenarios. The fact of the matter is that a lot of teams are new to being remote and may face a host of unforeseen challenges.

2020-06-22 12:24:52+00:00 Read the full story…
Weighted Interest Score: 2.0739, Raw Interest Score: 1.1919,
Positive Sentiment: 0.3337, Negative Sentiment 0.6675

Zeroth-Order Optimisation And Its Applications In Deep Learning

Deep learning applications usually involve complex optimisation problems that are often difficult to solve analytically. Often the objective function itself is not available in closed form, meaning it permits only function evaluations, without any gradient evaluations. This is where Zeroth-Order optimisation comes in.

Optimisation for these kinds of problems falls into the category of Zeroth-Order (ZO) optimisation with respect to black-box models, where explicit expressions of the gradients are hard to estimate or infeasible to obtain. Researchers from IBM Research and the MIT-IBM Watson AI Lab discussed the topic of Zeroth-Order optimisation at the ongoing Computer Vision and Pattern Recognition (CVPR) 2020 conference.

In this article, we will take a dive into what Zeroth-Order optimisation is and how this method can be applied in complex deep learning applications.
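To make the idea concrete, here is a minimal sketch (on a simple quadratic objective of our own choosing, not an example from the conference material) of a standard two-point ZO estimator: the gradient is approximated from function evaluations alone, by finite differences along random directions, and then plugged into ordinary gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    """Black-box objective: we may evaluate it but not differentiate it."""
    return float(np.sum(x ** 2))

def zo_gradient(f, x, mu=1e-4, n_dirs=50):
    """Two-point zeroth-order gradient estimate, averaged over random directions."""
    d = x.size
    grad = np.zeros(d)
    for _ in range(n_dirs):
        u = rng.standard_normal(d)
        # Central difference along u approximates the directional derivative.
        grad += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return grad / n_dirs

# Zeroth-order gradient descent: only function values are ever queried.
x = np.array([3.0, -2.0, 1.0])
for _ in range(100):
    x -= 0.05 * zo_gradient(objective, x)

print(f"final objective value: {objective(x):.6f}")
```

Each step costs 2 × n_dirs function evaluations instead of one gradient evaluation, which is exactly the trade-off that makes ZO methods attractive when gradients are unavailable (black-box adversarial attacks being a common motivating example) and expensive when they are not.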

2020-06-21 04:30:00+00:00 Read the full story…
Weighted Interest Score: 2.0614, Raw Interest Score: 1.5796,
Positive Sentiment: 0.2228, Negative Sentiment 0.5265


This news clip post is produced algorithmically based upon CloudQuant’s list of sites and focus items we find interesting. We used natural language processing (NLP) to determine an interest score, and to calculate the sentiment of the linked article using the Loughran and McDonald Sentiment Word Lists.

If you would like to add your blog or website to our search crawler, please email customer_success@cloudquant.com. We welcome all contributors.

This news clip and any CloudQuant comment is for information and illustrative purposes only. It is not, and should not be regarded as investment advice or as a recommendation regarding a course of action. This information is provided with the understanding that CloudQuant is not acting in a fiduciary or advisory capacity under any contract with you, or any applicable law or regulation. You are responsible to make your own independent decision with respect to any course of action based on the content of this post.