Monday, June 22, 2015

Release of information in machine-readable format

Most businesses, such as airlines, hotels and restaurants, benefit from
making their information available as an electronic feed. For some
functionality to come together, one also needs data from government
sources. For instance, the ubiquitous Global Positioning System (GPS)
data that so many applications use is a government resource, available
for free in an electronic format which computers can crunch.

Decades ago, the U.S. Government set the ball rolling when it first
made both weather data and GPS freely available. Since then,
entrepreneurs and innovators have used these resources to create
navigation systems, weather newscasts and warning systems,
location-based applications, precision farming tools, and much more.
One area which has not benefited adequately from this data revolution
is financial services.

I have a blogpost (reproduced below) which talks about the recent
global, government-led push towards open and machine-readable data,
and how the financial industry in India could benefit from the release
of information in machine-readable format.

All data starts out in computers. All data is analysed using
computers. However, all too often, materials are produced and placed
on websites which are not readable by computers. This dramatically
drives up the cost of using the data. There is much to gain from an
insistence that the materials which appear on websites -- of financial
firms and of regulators -- are machine readable.

One success story is mutual funds, which provide data such as the NAV
(Net Asset Value) of their schemes as an electronic feed. Third-party
websites are able to use this to provide annualised returns on
portfolios, or analysis and comparison of historical data. None of
this would have been possible if mutual funds had a chaos of diverse
presentation, with different fund houses giving out data in different
formats.

The above-mentioned electronic feed is an example of machine-readable
data. "A computer file" does not constitute machine readable
data. Machine readability is obtained where the data can be read and
processed by a computer for further analysis and interpretation. Comma
Separated Values (CSV) is one example of a machine-readable data
format. Other examples include XML files.
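
To make the distinction concrete, here is a minimal Python sketch of
what a third party can do once a NAV feed arrives as CSV. The feed
layout (columns scheme, date, nav), the scheme name and the NAV values
are all invented for illustration; they are not any fund house's
actual format.

```python
import csv
import io
from datetime import date

# Hypothetical NAV feed: layout, scheme name and values are invented.
FEED = """scheme,date,nav
Example Equity Fund,2012-06-22,10.00
Example Equity Fund,2015-06-22,13.31
"""

def read_nav_feed(text):
    """Parse the CSV feed into a list of (scheme, date, nav) tuples."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append(
            (rec["scheme"], date.fromisoformat(rec["date"]), float(rec["nav"]))
        )
    return rows

def annualised_return(start_nav, end_nav, start_date, end_date):
    """Compound annual growth rate between two NAV observations."""
    years = (end_date - start_date).days / 365.25
    return (end_nav / start_nav) ** (1 / years) - 1

rows = read_nav_feed(FEED)
(_, d0, nav0), (_, d1, nav1) = rows[0], rows[-1]
rate = annualised_return(nav0, nav1, d0, d1)
print(f"Annualised return: {rate:.2%}")
```

The same few lines would work for every fund house that publishes in
the same layout. With a scanned table or a free-form PDF, each source
would need its own fragile extraction step first.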

The gains from machine readable data
The value of even minimal information, when made accessible in machine
readable form, is remarkable. As an example, suppose a government
releases adequate information for all consumer courts to be placed on
Google Maps. Once this is done, consumers can start rating the
courts. This can support policy analysis and improvement of the courts
which are laggards. As more information is released, more
sophisticated applications become possible. If case load data about
consumer courts is made available, third parties could build software
and systems through which one could get a fairly accurate estimate of
the queue and expected hearing slot if a complaint were to be filed on
any given day. If data on financial firms against which complaints are
filed is also loaded, one would know which firms are generating more
complaints.

Consider a household survey run by a regulator. The regulator can
release a PDF file with a report which analyses the survey
evidence. This is useful and interesting. A big jump is obtained when
the regulator releases the record level data. This would make possible
novel analysis by third parties, of kinds that may have never been
envisaged by the regulator.
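
As a rough illustration of what record-level data makes possible, the
Python sketch below computes a tabulation that a published report
might not contain. The column names (state, has_bank_account,
has_insurance) and the records themselves are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical record-level survey extract; columns and values are invented.
RECORDS = """household_id,state,has_bank_account,has_insurance
1,State A,yes,yes
2,State A,yes,yes
3,State B,no,no
4,State B,yes,no
5,State B,yes,yes
"""

def insurance_share_among_banked(text):
    """Per state: share of households with a bank account that also hold insurance."""
    banked = defaultdict(int)
    insured = defaultdict(int)
    for rec in csv.DictReader(io.StringIO(text)):
        if rec["has_bank_account"] == "yes":
            banked[rec["state"]] += 1
            if rec["has_insurance"] == "yes":
                insured[rec["state"]] += 1
    return {state: insured[state] / banked[state] for state in banked}

shares = insurance_share_among_banked(RECORDS)
for state, share in sorted(shares.items()):
    print(f"{state}: {share:.0%} of banked households hold insurance")
```

A report in PDF would fix the questions in advance; with the records
in hand, a third party can ask a different question by changing a few
lines.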

A revolution is shaking the world of finance globally: the financial
technology revolution. At its core, this is about opening up data access
to new kinds of firms, while access is controlled by consumers. This
is about shifting ownership of data from financial firms or
governments to consumers, and giving consumers access to sophisticated
analytical services which add value.

Developments internationally
These ideas are not unique to India; they are changing the way
governments and regulations work worldwide. The U.S. Government’s Open
Government Directive of 2009 is an early example of a government
creating such an obligation. It said that to the extent practicable and
subject to valid restrictions, agencies should publish information
online in an open format that can be retrieved, downloaded, indexed,
and searched by commonly used web search applications. An open format
was defined as one that is platform independent, machine readable, and
made available to the public without restrictions that would impede
the re-use of that information.

Initially this met with resistance and produced inconsistent formats,
and the government had to build capacity to make it happen. The US
government subsequently set up a website as the home of its open data
initiative, with tools and resources to conduct research, develop web
and mobile applications and design data visualisations. It followed
this up in 2013 by making open and machine-readable the new default
for government information.

Many countries have embarked on similar initiatives. As an example,
see a paper tabled in 2012 in the UK Parliament about unleashing the
potential of open data. In 2011, the Open Government Partnership (OGP)
was launched as an international platform for domestic reformers
committed to making their governments more open, accountable, and
responsive to citizens. Since then, OGP has grown from 8 countries to
65 participating countries. India is not yet on that list.

Implications for the draft Indian Financial Code

The draft Indian Financial Code (IFC) contains strong reporting
mechanisms so as to achieve accountability of financial sector
regulatory institutions. This needs to be pushed further in the
direction of the release of machine-readable data. Good reporting can
be used more effectively if the data tables and charts can be read
and analysed with minimal friction through computer programs. In
Chapter 16, `Functioning of the financial agency', the first section
`Minimum standard for publication of information' (S.74(2)) says:

All information published on the website or other repository of the
Financial Agency must be in an easily accessible and text-searchable
format.

The phrase `machine readable format' needs to be defined and used in
the law. This would encourage innovative financial sector firms and
third parties to provide analysis to consumers using tools like
mobile-based apps, thus helping consumers make better choices in a
timely manner.