Posts

Showing posts with the label public data

Accounting for bias when analyzing public data

We tend to overestimate the reliability of authority figures, and this affects how we should analyze data for public policy.

Public data is an intrinsic appeal to authority

The CDC's WONDER database keeps track of causes of death within the United States. When a death certificate is created for a person in the United States, the certificate includes a special code indicating the cause of death. Through a lengthy process, that information makes its way from the funeral home or hospital to a state registry, then to the National Vital Statistics System, and finally to the CDC. The CDC tracks that information in WONDER, which can be partially queried by the public. WONDER is used by scientists, researchers and journalists for all sorts of reasons. Data from WONDER largely provided the justification for the claim that the United States has been undergoing an epidemic of heroin addiction. And by any measure, the US has a serious problem with heroin and abuse of other opiate drugs. But W

Private Data vs Public Data

Five years ago, someone by the name of Hacker Croll acquired a large number of sensitive internal corporate documents from Twitter employees. Hacker Croll took 310 of these documents and sent them to the website Techcrunch. Techcrunch decided to use the information, publishing a series of stories based on the documents and on the reactions of Twitter and Techcrunch's readers to their release. The documents themselves were not all that terrible. Twitter, it seems, is not an internet Enron. The release of the documents did not result in any serious consequences for Twitter: no flight of investment, no investigations, no indictments. Techcrunch summarized the contents of the documents as: "executive meeting notes, partner agreements and financial projections to the meal preferences, calendars and phone logs." For a crooked company such documents would be an absolute disaster. But few outside of the Internet and journalism industries noticed what happened.