Hi all,

As you know, there is an open comment period on ICANN's Open Data Initiative. This is a great initiative, but right now it tackles only a very small subset of ICANN websites: just what is on the web, not what is on the wiki. Below is a summary of what has been done so far. Comments are open until July 27. Please send ideas to the list or post them directly on the At-Large wiki so that we can contribute to an At-Large response or enter our own comments on the documents.

https://community.icann.org/display/alacpolicydev/At-Large+Workspace%3A+Open+Data+Initiative+Datasets+and+Metadata

Over the last few months a Data Asset Inventory has been created within ICANN org. This lists the preliminary set of datasets that ICANN holds along with associated attributes, such as the system of record and data format. It is the intention that every dataset on this list that can be published as open data will be published over time. That will be a lengthy and complex process and could take some time to complete. Consequently, we are seeking feedback that will help us prioritize the publication of datasets. The Data Asset Inventory is available in both CSV and PDF formats as detailed in section III.

In addition, they have defined a vocabulary for the metadata that will be published alongside the data and are seeking feedback on it. This vocabulary is detailed in section IV below and is available in both CSV and PDF formats as detailed in section III.

The specific questions we seek feedback on are as follows:

What are your priorities for publication of datasets identified in the data asset inventory?

The next major stage in the Open Data Initiative is the lengthy process of publishing datasets on the upcoming open data platform. This stage could take some time, so it is important that community priorities are taken into account and the highest priority data are released first.

The datasets within ICANN are stored in a variety of formats in a variety of systems of record. In many cases, custom code will need to be developed by ICANN staff to publish data, with any required redaction or aggregation applied, a process that can vary widely in both time and cost. Accordingly, we do not intend to translate community priorities directly into a prioritized list for publication and will instead use a prioritization model that combines ease of publication and community priority.
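
As an illustration only, here is one way a model combining the two factors could work. ICANN has not published its model, so the weighted-sum formula, the 0-to-1 scales, the weight, and the dataset names below are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    community_priority: float   # hypothetical scale: 0.0 (low) to 1.0 (high)
    ease_of_publication: float  # hypothetical scale: 0.0 (hard) to 1.0 (easy)


def score(ds: Dataset, community_weight: float = 0.5) -> float:
    """Weighted sum of community priority and ease of publication (assumed formula)."""
    return (community_weight * ds.community_priority
            + (1.0 - community_weight) * ds.ease_of_publication)


def rank(datasets: list[Dataset], community_weight: float = 0.5) -> list[Dataset]:
    """Return datasets ordered from highest to lowest combined score."""
    return sorted(datasets, key=lambda d: score(d, community_weight), reverse=True)


# Placeholder datasets, not entries from the real inventory.
datasets = [
    Dataset("dataset_a", community_priority=0.9, ease_of_publication=0.2),
    Dataset("dataset_b", community_priority=0.6, ease_of_publication=0.8),
    Dataset("dataset_c", community_priority=0.3, ease_of_publication=0.9),
]

for ds in rank(datasets):
    print(ds.name, round(score(ds), 2))
```

With equal weights, dataset_b comes out ahead of dataset_a even though dataset_a has the highest community priority. That is why it matters to state priorities clearly in the comments: under a model like this, a strongly requested dataset can still be published later if it is hard to release.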

Are there any errors or omissions in the data asset inventory?

Creating an inventory of datasets is a complex process, as this is a cutting-edge subject and staff of ICANN org, as with many other organizations, are still learning what makes a dataset. There is therefore a distinct possibility of errors or omissions in the inventory, and for that reason we seek feedback on the inventory as provided.

Even if you do not know whether ICANN holds a specific dataset, if you would like to see that dataset published by ICANN, please let us know.

Does the proposed metadata vocabulary meet your needs?

The metadata vocabulary is based on the Project Open Data Metadata Schema v1.1 with minor amendments. We have chosen this standard over other standards such as DCAT due to its simplicity, greater applicability and ease of processing. This choice does not preclude us from later adding additional metadata schemes to our published open data.
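
For anyone unfamiliar with the schema, the sketch below shows roughly what a dataset record looks like under Project Open Data Metadata Schema v1.1, using the fields that schema requires for every dataset. The values are placeholders, and ICANN's "minor amendments" are not reflected here, so check the published vocabulary before relying on any field.

```python
import json

# Fields required for every dataset in Project Open Data Metadata Schema v1.1.
# ICANN's amended vocabulary may add, drop, or rename fields.
REQUIRED_FIELDS = (
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
)


def missing_fields(record: dict) -> list[str]:
    """List required fields that are absent or empty in a metadata record."""
    return [f for f in REQUIRED_FIELDS
            if f not in record or record[f] in ("", [], None)]


# Placeholder record: every value here is hypothetical.
record = {
    "title": "Example dataset",
    "description": "A placeholder description of what the dataset contains.",
    "keyword": ["example", "open data"],
    "modified": "2018-07-01",
    "publisher": {"@type": "org:Organization", "name": "Example Organization"},
    "contactPoint": {
        "@type": "vcard:Contact",
        "fn": "Example Contact",
        "hasEmail": "mailto:opendata@example.org",
    },
    "identifier": "https://example.org/datasets/example-dataset",
    "accessLevel": "public",
}

print(json.dumps(record, indent=2))
print("Missing required fields:", missing_fields(record))
```

A check like missing_fields is one way to test whether the proposed vocabulary covers what you need: write out the record you would want for a dataset you care about and see which of your fields have no home in the vocabulary.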

Looking forward to a robust discussion of this.

Judith

Chair of the Technology Task Force

-- 
_________________________________________________________________________
Judith Hellerstein, Founder & CEO
Hellerstein & Associates
3001 Veazey Terrace NW, Washington DC 20008
Phone: (202) 362-5139  Skype ID: judithhellerstein
Mobile/WhatsApp: +1202-333-6517
E-mail: Judith@jhellerstein.com   Website: www.jhellerstein.com
LinkedIn: www.linkedin.com/in/jhellerstein/
Opening Telecom & Technology Opportunities Worldwide