# Where does data come from?

* New data
* Pre-existing
  * Internal 'legacy' data
  * External data

## New data

* Adding as you go
* Bulk data entry

## Pre-existing data

We may need to perform:

* Extraction
* Conversion
* Cleaning

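
The three steps listed for pre-existing data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the semicolon-separated `LEGACY_EXPORT` text, its field names, and the helper functions `extract`, `convert`, `clean` and `parse_date` are all hypothetical, and a real legacy source would need its own rules.

```python
import csv
import io
from datetime import datetime

# Hypothetical legacy export: stray spaces, mixed date formats,
# a duplicated row, and a row with no name.
LEGACY_EXPORT = """id;name;joined
1; Alice ;2019-03-01
2;BOB;01/04/2020
2;BOB;01/04/2020
3;;2021-07-15
"""

def extract(raw):
    """Extraction: pull rows out of the semicolon-separated text."""
    return list(csv.DictReader(io.StringIO(raw), delimiter=";"))

def parse_date(text):
    """Try each date format we assume the legacy system used."""
    # Assumption: slash-separated dates are day-first (01/04/2020 = 1 April).
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None

def convert(rows):
    """Conversion: map the legacy text fields onto the types we want."""
    return [
        {"id": int(row["id"]), "name": row["name"], "joined": parse_date(row["joined"])}
        for row in rows
    ]

def clean(rows):
    """Cleaning: tidy names, drop duplicate ids and rows without a name."""
    seen = set()
    cleaned = []
    for row in rows:
        name = row["name"].strip().title()
        if not name or row["id"] in seen:
            continue
        seen.add(row["id"])
        cleaned.append({**row, "name": name})
    return cleaned

records = clean(convert(extract(LEGACY_EXPORT)))
```

Note that the day-first reading of `01/04/2020` is itself a guess about the source, which is exactly the kind of decision the cleaning step forces us to make explicit.
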
## External sources

Positives

* No costs for data entry
* No costs for quality checks
* Delegate expertise

Negatives

* No control over data quality
* No control over data structure
* May be incomplete
* May be ambiguous
* Questions of trustworthiness

# What does your data look like?

An external source of information may be ambiguous or incomplete relative to our expectations.

Different interests will shape the content of the data we want to represent.

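
One way to make "our expectations" concrete is to write them down and check incoming records against them. The sketch below is a minimal illustration, not a full validation scheme: the `EXPECTED_FIELDS` table, the `check_record` helper, and the station records are all made up for the example.

```python
# Hypothetical expectations: the fields we need and the type each should have.
EXPECTED_FIELDS = {"name": str, "latitude": float, "longitude": float}

def check_record(record):
    """Return a list of problems found in one external record."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        value = record.get(field)
        if value is None or value == "":
            # Incomplete: the field we expect is absent or empty.
            problems.append(f"{field}: missing")
        elif not isinstance(value, expected_type):
            # Ambiguous: the value is there, but not in the form we expect.
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems

# Made-up records standing in for an external source.
external_records = [
    {"name": "Station A", "latitude": 52.1, "longitude": 4.3},
    {"name": "Station B", "latitude": "52.4"},  # text latitude, no longitude
]

report = {r["name"]: check_record(r) for r in external_records}
```

Here `report["Station A"]` is an empty list, while `report["Station B"]` flags a latitude supplied as text and a missing longitude.
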
# Licenses, sharing and ethics

## Why would someone let me use their data?

* Drive sales (commercial reasons)
* For the common good (ethical reasons)
* Contract requirements (contractual reasons, such as government contracts)

## Why not publish open data?

* Restrictions on source data (e.g. medical records)
* Control of use
* Value of the data (you're in the business of selling data)