Alternative Data News. 01, July 2020

The AltDataNewsletter by CloudQuant

Finding sources and uses for alternative data can be difficult. At CloudQuant we regularly read and search the internet for new sources of data that can be used in our mission to find alpha signals and build quantitative trading strategies. We recognize that we are technology and data junkies so we wrote our own crawler that specifically seeks out web pages, posts, and news articles that give us a snapshot of what is going on in the world of Alt Data. The following is a collection of articles that we think you will find interesting from the past week.


The Major US Airlines’ Number of Flights Per Day

Source: Flightradar24.com

Tools used: matplotlib

This visualization was created for city-data.com

Their API is accessible using pyflightdata. You can list planes (tail numbers) by airline as well as get flight history for each tail number.

Q : How did you do it then? Did you search for all the flight numbers operated by a particular airline, then search for that flight information?

A : Yes, that’s a few lines of python code. The only issue is you have to throttle requests – so querying all the planes takes several hours.
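The throttling mentioned in the answer above can be sketched as follows. This is a minimal illustration, not the poster's actual code: `fetch_all_throttled`, `fetch_history`, and the 2-second interval are hypothetical names and values chosen for the example. The caller supplies whatever function actually retrieves the flight history for one tail number.

```python
import time
from typing import Callable, Dict, Iterable, List


def fetch_all_throttled(
    tail_numbers: Iterable[str],
    fetch_history: Callable[[str], List],  # hypothetical: returns history for one tail number
    min_interval_s: float = 2.0,           # assumed spacing between requests
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, List]:
    """Call fetch_history once per tail number, at most one call per min_interval_s."""
    results: Dict[str, List] = {}
    last_call = None
    for tail in tail_numbers:
        if last_call is not None:
            # Wait out whatever remains of the minimum interval since the last request.
            wait = min_interval_s - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results[tail] = fetch_history(tail)
    return results
```

At one request every two seconds, a few thousand tail numbers take a couple of hours, which is consistent with the "several hours" quoted above. The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting.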

2020-06-26 Read the Full Story…

CloudQuant Thoughts : Another great post from Data Is Beautiful over at Reddit.

Almost 17,000 Protesters Had No Idea A Tech Company Was Tracing Their Location

Data company Mobilewalla used cellphone information to estimate the demographics of protesters. Sen. Elizabeth Warren says it’s “shady” and concerning.

On the weekend of May 29, thousands of people marched, sang, grieved, and chanted, demanding an end to police brutality and the defunding of police departments in the aftermath of the police killings of George Floyd and Breonna Taylor. They marched en masse in cities like Minneapolis, New York, Los Angeles, and Atlanta, empowered by their number and the assumed anonymity of the crowd. And they did so completely unaware that a tech company was using location data harvested from their cellphones to predict their race, age, and gender and where they lived.

Just over two weeks later, that company, Mobilewalla, released a report titled “George Floyd Protester Demographics: Insights Across 4 Major US Cities.” In 60 pie charts, the document details what percentage of protesters the company believes were male or female; young adult (18–34), middle-aged (35–54), or older (55+); and “African-American,” “Caucasian/Others,” “Hispanic,” or “Asian-American.”

“These companies can even sell this data to the government, which can use it for law and immigration enforcement.”
“African American males made up the majority of protesters in the four observed cities vs. females,” Mobilewalla claimed. “Men vs. women in Atlanta (61% vs. 39%), in Los Angeles (65% vs. 35%), in Minneapolis (54% vs. 46%) and in New York (59% vs. 41%).” The company analyzed data from 16,902 devices at protests — including exactly 8,152 devices in New York, 4,527 in Los Angeles, 2,357 in Minneapolis, and 1,866 in Atlanta.

Sen. Elizabeth Warren told BuzzFeed News that Mobilewalla’s report was alarming, and an example of the consequences of the lack of regulation on data brokers in the US.

2020-06-25 Read the full story…

CloudQuant Thoughts : We often forget how little most people know about the data collected by firms like Mobilewalla. When the extent of their data collection shocks even data scientists, you know someone has overstepped the mark.

Quants Sound Warning as Everyone Chases Same Alternative Data

Quant and discretionary practitioners warn pandemic markets are shining a light on two big pitfalls: Investing signals are hard to find in the noise and, when money managers do strike gold, excess returns can vanish quickly.

“I’ve looked at probably 700 or 800 data sets over the last 10 years and about 90 to 95% of data sets tend to have basic evident biases to them,” said Qaisar Hasan, a fund manager at Lombard Odier Investment Managers in New York. “They don’t really deliver the claims the vendor has made.”

Like many of his peers, Hasan is using Apple and Google’s mobility statistics to help map the economic recovery as virus restrictions ease. But he warns those venturing deeper into the data world to tread carefully. A set of credit-card data might be skewed toward one demographic that isn’t receiving stimulus checks, or to a specific region of the U.S. which is unrepresentative of broader trends, for instance.

2020-06-25 04:35:54-04:00 Read the full story…
2020-06-25 12:01:05+00:00 Read the full story…
Weighted Interest Score: 5.0552, Raw Interest Score: 1.9180,
Positive Sentiment: 0.2166, Negative Sentiment 0.1701

CloudQuant Thoughts : Finding useful datasets that a) are not garbage, b) are free of bias, c) still contain useful alpha, and d) are easy to test and ingest remains this industry’s number one challenge. At CloudQuant we are constantly working to make locating and testing alternative data as easy as test driving a car.

BlackRock Adopts New Investing Framework Due to Pandemic

BlackRock has developed a “completely new macro framework” for investing as a result of the coronavirus shock to the global economy and global financial markets.

“We used to frame things as to where we were in the business cycle,” said Jean Boivin, head of the BlackRock Investment Institute, who discussed the firm’s midyear investment outlook in a webinar. “That is not the story anymore. The shock [of the pandemic] has fundamentally changed the investment environment and landscape … [and] that requires a deeper rethink of how we build portfolios.”
Continuing that theme, Elga Bartsch, who heads up economic and markets research at the BlackRock Institute, said the current global economic downturn “is not a recession” and its reversal will not be a recovery but “a restart” of the economy, which has strategic (longer term) and tactical (short term) implications for investors.

2020-06-30 00:00:00 Read the full story…
Weighted Interest Score: 2.7607, Raw Interest Score: 1.4817,
Positive Sentiment: 0.0228, Negative Sentiment 0.2507

CloudQuant Thoughts : The tragic number of deaths is obviously the major impact of the coronavirus, but a very close second, and some would claim a larger one, is the massive financial intervention in the market by the Fed. Without the Fed pumping trillions of dollars into bonds and ETFs, the market would have collapsed. The Fed did not execute this rescue alone; BlackRock assisted. And whilst the press claims that BlackRock made relatively modest fees from this assistance, we in data know that being on the inside and knowing what was happening when, presumably with access to the exact trades, puts BlackRock at a tremendous advantage when it comes to “predicting” the future direction of the economy and the market.

Analytics Best Practices for Transforming Data into a Business Asset

Data has three main functions that provide value to the business: To help in business operations, to help the company stay in compliance and mitigate risk, and to make informed decisions using analytics.

“Data can have an impact on your top line as well as your bottom line,” said Dr. Prashanth Southekal, CEO of DBP-Institute in a recent interview with DATAVERSITY®.

“Just capturing, storing, and processing data will not transform your data into a business asset. Appropriate strategy and the positioning of the data is also required,” he said. Southekal shared best practices for analytics and ways to transform data into an asset for the business.

2020-06-23 07:35:28+00:00 Read the full story…
Weighted Interest Score: 2.7507, Raw Interest Score: 1.6859,
Positive Sentiment: 0.4141, Negative Sentiment 0.1923

CloudQuant Thoughts : If you have data that you think may be of interest to market professionals, let CloudQuant help you get the data in front of the right people in the right format. Get in touch!

Research Reveals Shortcomings in Data Literacy Projects: Don’t Let This Happen to You

The studies cited in this report indicate that: “Only 32 percent of business executives surveyed said that they’re able to create measurable value from data, while just 27 percent said their data and analytics projects produce actionable insights. Organizations need to recognize that the exponential growth in data usage has accelerated far beyond the skills and confidence of the employees required to use it. Only 25 percent of employees felt fully prepared to use data effectively when entering their current role.”

It then went on to say later in the study that: “Despite nearly all employees recognizing data in the workplace as an asset, few are using it to inform decision-making. Only 37 percent of employees trust their decisions more when those decisions are based on data, and almost half (48 percent) frequently defer to making decisions based on gut feeling over data-driven insight.”

2020-07-01 07:35:30+00:00 Read the full story…
Weighted Interest Score: 2.5993, Raw Interest Score: 1.5194,
Positive Sentiment: 0.3575, Negative Sentiment 0.1639

CloudQuant Thoughts : Remember when listening to business managers quoting statistics that Jack Ma led one of the largest tech firms in the world, invested in AI and is estimated to be worth more than $43b.


ESG Section

CloudQuant also provides Alternative Data sets together with analysis in the form of a white paper, plus the code and data to reproduce the results in the white paper. Head over to our Data Catalog to find out more.

How to build an investment portfolio that supports racial justice

Measuring the social impact of your stocks and bonds is not always easy, but there are still many tools to help you promote racial equity with your investments

In the wake of widespread outrage and protests about racial injustice, many people are looking at their stock portfolio and wondering: what can I do to support racial justice with my dollars? If you are an investor of any type — whether you have a 401(k), IRA, or trading account — there are a few things you could do to promote racial equity.

ESG investing: the basics : You may have heard of ESG investing, which stands for “environmental, social and governance.” It is also often called sustainable, socially responsible or simply “values” investing. It’s an investment strategy that selects stocks and bonds based not only on traditional financial criteria, but also based on the impact of different companies on society and the environment.

2020-06-29 00:00:00 Read the full story…
Weighted Interest Score: 2.6584, Raw Interest Score: 1.4456,
Positive Sentiment: 0.1268, Negative Sentiment 0.1268

Activist Hedge Funds Can Smell Greenwashing, Study Finds

Hedge funds are going after firms that announce environmental, social, or governance plans — but not the ones that take them seriously.

Companies implementing social responsibility plans are twice as likely to enter activist hedge funds’ crosshairs as firms that are not addressing these issues. But management teams that are truly serious about environmental, governance and other impact goals, and not just greenwashing, may be able to avoid luring activists, according to new academic research.

Investors are increasingly deploying money via ESG and impact frameworks. Even Jeff Ubben, founder of $16 billion ValueAct Capital, is quitting to start an impact fund. Skeptics have long believed that a financial crisis would reduce the amount of attention paid to what are often considered soft issues like board diversity or the environmental impact of manufacturing plants. But investors have actually doubled down on ESG strategies since the pandemic shut down economies in March.

For companies wanting to get in on those capital flows (or do the right thing), the new study sheds light on how activists may react to ESG initiatives.

2020-06-25 Read the full story…

Carbon Transition Is ‘Extraordinary’ Opportunity

David Blood, co-founder and senior partner of Generation Investment Management, said the transition to a low-carbon economy presents ‘extraordinary’ economic opportunities which he has not seen before in his 35 years in finance.

He spoke this morning on a webinar, Investors as catalysts of the climate transition, hosted by the London Stock Exchange Group and the United Nations-backed Principles of Responsible Investment.

“People are recognising the link between sustainability, inequality and resilience,” Blood said. “Investors are insisting that climate change and social justice should be addressed when we build back better after Covid-19. The economic opportunities are extraordinary, which I have not seen in my 35 years of finance.”

2020-06-30 17:23:13+00:00 Read the full story…
Weighted Interest Score: 2.7551, Raw Interest Score: 1.6150,
Positive Sentiment: 0.1900, Negative Sentiment 0.0760

Why Jeff Bezos is pouring billions into tackling climate change

Amazon is making much of its efforts to tackle climate change. But what does it stand to gain?

Jeff Bezos wants you to know that Amazon is serious about tackling climate change. In the space of four days last week, his company launched a $2bn (£1.6bn) venture capital fund to invest in technologies that tackle carbon emissions, bought an electric self-driving car firm and revealed that it would rename a Seattle hockey stadium to the “Climate Pledge Arena”….

According to a 2018 report from the Intergovernmental Panel on Climate Change, a United Nations body, the cost of a 1.5°C increase in temperatures by 2030 could lead to damage costs of $54tn.

Amazon’s ambitions to address its carbon footprint appear significant. It hopes to power all of its operations with 100pc renewable energy by 2025. By 2030, the aim is to make all Amazon shipments net zero on carbon. Ten years after that, the goal is to be …

2020-06-30 00:00:00 Read the full story…
Weighted Interest Score: 2.6188, Raw Interest Score: 1.4035,
Positive Sentiment: 0.2159, Negative Sentiment 0.1889

The 100 Most Sustainable Companies, Reranked by Social Factors

For years, the three pillars of ESG investing—“E” for environmental factors, “S” for social factors, and “G” for corporate governance—were uneasy bedfellows.

Few investors would argue with the importance of good corporate governance, and most have slowly come to realize that it’s critical for companies to understand any investment risk, or opportunity, that stems from global warming. The S has been easily dismissed as consisting of squishy criteria such as how companies treat their employees, data security, and product safety….

2020-06-28 Read the Full Story…


How Satellite Imagery is Helping Hedge Funds Outperform

At the beginning of the last decade, Swiss investment firm UBS Investment Research began partnering with satellite companies such as Remote Sensing Metrics LLC in order to gauge changes in the occupancy rates of parking lots belonging to Walmart. By taking images of the number of cars entering and leaving the parking lots over certain fixed time periods, it was able to determine the number of customers who were visiting the US mega-retailer; and from this data, an approximation of Walmart’s quarterly sales could be extrapolated. In so doing, UBS became one of the first financial institutions to leverage satellite imagery to gain useful investment insights. “UBS proprietary satellite parking lot fill rate analysis points to an interesting cadence intra-quarter and potential upside to our view,” the subsequent report read.

Satellite imagery falls under the umbrella of alternative data, which represents non-traditional forms of data that are greatly coveted by fund managers eager to gain a competitive informational edge over their peers. Whether it’s counting cars in a retailer’s parking lot as a measure of sales activity, tracking ships across the seas, monitoring crops or scanning the activity at oil rigs, refineries and ports, satellite imagery is proving incredibly useful as a way to measure levels of industrial activity that may not necessarily be possible to determine at ground level.

“It’s not magic. It’s just another input,” noted Matthew Granade, chief market intelligence officer at Point72 Asset Management. “This stuff works best as one input into a much bigger process. On the other hand, it’s getting harder and harder not to have these critical inputs.” Indeed, in the 10 years since UBS’s forays into the nascent field, demand for satellite imagery has skyrocketed such that it is now used extensively across the investment industry. Today, financial institutions—particularly hedge funds—are paying increasingly exorbitant amounts to gain access to information that can reveal crucial insights tied to a potential investment.

2020-06-26 Read the full story…

COVID Notebooks Aims to Speed Predictive Models

IBM’s new open source toolkit with AI extensions to the Jupyter notebooks data science development platform is being extended to a COVID notebooks platform designed to help analyze real-time data about the pandemic. The company’s Center for Open-Source Data and AI Technologies developed the COVID notebooks toolkit that among other things addresses data quality issues related to coronavirus analytics. Along with compiling “authoritative” data on the pandemic, the IBM unit said it “clean[ed] up the most serious data-quality problems.”

“Policy makers are asking questions including: What stories can we tell in the aggregate? Are there patterns we see across the country? What regions or demographics are getting affected the most by the pandemic?” the company said in a blog post. Given that underlying data about the pandemic changes daily, COVID notebooks allows data scientists to concentrate on building models rather than data cleaning. The tool allows frequent updates of results on analysts’ notebooks.

2020-06-25 Read the full story…

Handling Missing Data For Advanced Machine Learning

Throughout this article, you will become good at spotting, understanding, and imputing missing data. We demonstrate various imputation techniques on a real-world logistic regression task using Python. Properly handling missing data has an improving effect on inferences and predictions. This is not to be ignored. The first part of this article presents the framework for understanding missing data. Later we demonstrate the most popular strategies in dealing with missingness on a classification task to predict the onset of diabetes.

MISSING DATA IS HARD TO AVOID : A considerable part of data science or machine learning job is data cleaning. Often when data is collected, there are some missing values appearing in the dataset. To understand the reason why data goes missing, let’s simulate a dataset with two predictors x1, x2, and a response variable y.
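The simulation described above might look like the following sketch. It is not the linked article's code. The coefficients, the 20% missing rate, and the choice to drop values completely at random (MCAR) are assumptions made here for illustration, and mean imputation is used only because it is the simplest strategy to show.

```python
import math
import random
from statistics import mean


def simulate_with_missing(n=1000, missing_rate=0.2, seed=0):
    """Simulate rows (x1, x2, y) where some x2 values are missing (None)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        # Binary response drawn from a logistic model (coefficients are assumed).
        p = 1.0 / (1.0 + math.exp(-(0.8 * x1 - 1.2 * x2)))
        y = 1 if rng.random() < p else 0
        # Assumption: x2 goes missing completely at random (MCAR).
        x2_observed = None if rng.random() < missing_rate else x2
        rows.append((x1, x2_observed, y))
    return rows


def mean_impute_x2(rows):
    """Replace each missing x2 with the mean of the observed x2 values."""
    fill = mean(x2 for _, x2, _ in rows if x2 is not None)
    return [(x1, fill if x2 is None else x2, y) for x1, x2, y in rows]


rows = simulate_with_missing()
filled = mean_impute_x2(rows)
print(sum(x2 is None for _, x2, _ in rows), "values imputed")
```

Mean imputation keeps every row but shrinks the variance of x2, which is one reason more careful strategies are usually compared against it.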

2020-06-25 15:02:46+00:00 Read the full story…
Weighted Interest Score: 2.6143, Raw Interest Score: 1.3163,
Positive Sentiment: 0.0658, Negative Sentiment 0.0965

Data privacy rules stop banks from auditing algorithms for bias

Companies say they need access to sensitive data to make sure their systems are being fair

Banks are struggling to audit algorithms for racial bias because of European privacy laws, experts warn.

A privacy clampdown has made collecting the information needed to work out if an automated system has made an unfair decision difficult under the General Data Protection Regulation in the UK and Europe, businesses have said. Scientists have long warned that automated systems which make decisions about who to lend money to may be as prejudiced as humans because the data used to train the algorithm may not be diverse enough to be fair. To mitigate this, institutions need data including race or gender to ensure that the system is not discriminating against these groups.

2020-06-29 00:00:00 Read the full story…
Weighted Interest Score: 2.3513, Raw Interest Score: 1.0381,
Positive Sentiment: 0.2076, Negative Sentiment 0.2076

Refinitiv to Power FxPro’s Data for Real-Time Prices, Corporate Actions, News, ESG

FxPro has adopted Refinitiv’s data and execution management solutions to support its online trading platforms, according to an official announcement. Refinitiv’s global data across multiple asset classes will power FxPro’s capabilities with real-time prices, corporate actions, market-moving Reuters top news, and Environmental, Social, and Governance (ESG) data.

Refinitiv is one of the world’s largest providers of financial markets data and infrastructure, serving over 40,000 institutions in over 190 countries. FxPro is a global Contracts for Difference (CFDs) broker that offers clients trading capabilities on foreign exchange, futures, shares, indices, energies and metals. The partnership will provide FxPro with powerful tools that support their online trading applications with data, analytics, and transactional connectivity.

2020-07-01 09:56:15+00:00 Read the full story…
Weighted Interest Score: 4.8528, Raw Interest Score: 2.5130,
Positive Sentiment: 0.3191, Negative Sentiment 0.0000

Data Science on the Buy Side

What are the main data challenges / pain points for the buy side?

A big challenge is obtaining and retaining data science talent. It is apparent that there is a growing demand, and therefore competition, for data science talent across all industries, not just in financial services. Another challenge relates to the ability to ingest and curate structured and unstructured data rapidly and in a variety of raw formats. The growth in new data providers has led to a wide variance in the quality of data offered by data providers; some providers are well-established and have appropriate data science and technology teams, whereas others can be as limited as two employees in a start-up.

For data to be useful it needs to be clean, consistent and sourced and processed appropriately. Often data is provided after some processing steps are done, which limits awareness of the raw data and can lead to the risk of false representation and predictability.

2020-06-24 11:54:29+00:00 Read the full story…
Weighted Interest Score: 3.7405, Raw Interest Score: 1.8998,
Positive Sentiment: 0.1981, Negative Sentiment 0.0932

Model Evaluation Metrics for Machine Learning

Whenever you build a statistical or Machine Learning model, all the audiences including business stakeholders have only one question, what is model performance? What are model evaluation metrics? What is the accuracy of a model?

Evaluating your developed model helps you refine the model. You keep developing and evaluating your model until you reach an optimum model performance level. (Optimum model performance doesn’t mean 100 percent accuracy; 100 percent accuracy is a myth).

I have seen many analysts and aspiring data scientists who do not give importance to the model performance or model evaluation metrics. You can develop n number of models on one data set, but which model should be picked is the main question. And model evaluation metrics are the answers.
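As a concrete illustration of why accuracy alone is rarely the whole answer, the sketch below computes four common metrics for a binary classifier from its confusion-matrix counts. The function name and the sample labels are made up for this example and are not taken from the linked article.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 3 true positives, 3 true negatives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```

On an imbalanced data set, a model that always predicts the majority class can score high accuracy with zero recall, which is why candidate models are usually compared on several of these metrics at once.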

2020-06-27 07:38:06+00:00 Read the full story…
Weighted Interest Score: 3.1709, Raw Interest Score: 1.7172,
Positive Sentiment: 0.1703, Negative Sentiment 0.4186

Google’s G Suite finalizes Connected Sheets and introduces AI-driven data cleanup tools

Last April during its Cloud Next conference, Google unveiled Connected Sheets, a type of Google Sheets spreadsheet that works with the full data set from BigQuery, up to 10 billion rows. After just over a year in preview and beta, Connected Sheets is generally available as of today. And in the coming months, it will be joined by new capabilities — Smart Fill and Smart Cleanup — that leverage AI to learn patterns between columns to autocomplete data and surface suggestions in Sheets’ side panel.

Connected Sheets, along with Smart Fill and Smart Cleanup, are intended to make it easier for G Suite customers to take informed actions and produce better results. According to Gartner, 87% of organizations have low business intelligence and analytics maturity, meaning they’re largely relying on spreadsheet-based management systems while lacking data guidance and support.

“At Google Cloud, we believe everyone — not just those who specialize in writing complex queries — should be able to harness the power of data,” G Suite product manager Ryan Weber wrote in a blog post. “We continue to build Google AI natively into Sheets, so it’s easy for everyone — not just specialized analysts — to quickly make data-backed decisions.”


2020-06-30 00:00:00 Read the full story…
Weighted Interest Score: 3.1066, Raw Interest Score: 1.6561,
Positive Sentiment: 0.1361, Negative Sentiment 0.2042

QuantHouse Adds Machine Learning From Trading System Lab

QuantHouse, the global provider of end-to-end systematic trading solutions including innovative market data services, algo trading platform and infrastructure products and part of Iress, today announced that Trading System Lab® (TSL) has added their machine learning capabilities as part of the QuantFactory cloud backtesting suite.

The QuantFactory cloud backtesting suite provides a fully configurable environment in which clients can develop, backtest, optimise and implement quantitative trading strategies that can later be executed in a standalone, live-trading environment. Machine learning outputs from TSL are integrated into the QuantDeveloper module of QuantFactory.

2020-06-30 10:32:51+00:00 Read the full story…
2020-06-30 13:00:45+00:00 Read the full story…
2020-06-30 00:00:00 Read the full story…
Weighted Interest Score: 7.0464, Raw Interest Score: 2.4194,
Positive Sentiment: 0.3820, Negative Sentiment 0.0424


This news clip post is produced algorithmically based upon CloudQuant’s list of sites and focus items we find interesting. We use natural language processing (NLP) to determine an interest score, and to calculate the sentiment of the linked article using the Loughran and McDonald Sentiment Word Lists.
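For readers curious how word-list sentiment works in general, here is a minimal sketch. It is not CloudQuant's implementation, and this post does not state the exact formula behind the scores shown above. The example assumes each score is the percentage of an article's words that appear in the positive or negative list, and the two tiny word sets are illustrative stand-ins for the much larger Loughran and McDonald lists.

```python
import re

# Illustrative stand-ins only; the real word lists contain thousands of entries.
POSITIVE_WORDS = {"gain", "improve", "strong"}
NEGATIVE_WORDS = {"loss", "decline", "weak"}


def sentiment_scores(text, positive=POSITIVE_WORDS, negative=NEGATIVE_WORDS):
    """Return (positive %, negative %) of tokens found in each word list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0, 0.0
    pos_hits = sum(1 for t in tokens if t in positive)
    neg_hits = sum(1 for t in tokens if t in negative)
    # Assumed scaling: hits as a percentage of all tokens in the text.
    return 100.0 * pos_hits / len(tokens), 100.0 * neg_hits / len(tokens)


pos, neg = sentiment_scores("Strong gain offsets one loss today")
print(f"Positive Sentiment: {pos:.4f}, Negative Sentiment: {neg:.4f}")
```

A dictionary approach like this is transparent and fast, but it ignores negation and context, so scores are best read as rough indicators rather than precise measurements.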

If you would like to add your blog or website to our search crawler, please email customer_success@cloudquant.com. We welcome all contributors.

This news clip and any CloudQuant comment is for information and illustrative purposes only. It is not, and should not be regarded as investment advice or as a recommendation regarding a course of action. This information is provided with the understanding that CloudQuant is not acting in a fiduciary or advisory capacity under any contract with you, or any applicable law or regulation. You are responsible to make your own independent decision with respect to any course of action based on the content of this post.