Week 1 notes completed
# Where does data come from?

* New data
* Pre-existing data
  * Internal 'legacy' data
  * External data

## New data

* Adding as you go
* Bulk data entry

## Pre-existing data

We may need to perform:

* Extraction
* Conversion
* Cleaning
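
As a rough sketch of what these three steps might look like for a small legacy CSV export: the column names, sample rows, and day/month/year date format below are all made up for illustration, and a real legacy source will likely need different handling.

```python
import csv
import io
from datetime import datetime

# Hypothetical legacy export: stray whitespace, a day/month/year date
# format, an empty value, and a duplicated row. Not from a real system.
LEGACY_CSV = """name,joined,city
 Alice ,03/01/2019,Leeds
Bob,17/11/2020,
Bob,17/11/2020,
"""


def extract(text):
    """Extraction: pull raw rows out of the legacy format (CSV here)."""
    return list(csv.DictReader(io.StringIO(text)))


def convert(row):
    """Conversion: map legacy values onto the representation we want."""
    joined = datetime.strptime(row["joined"].strip(), "%d/%m/%Y").date()
    return {
        "name": row["name"],
        "joined": joined.isoformat(),  # ISO 8601 date string
        "city": row["city"],
    }


def clean(rows):
    """Cleaning: trim whitespace, mark empty values as missing, drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        tidy = {key: (value.strip() or None) for key, value in row.items()}
        fingerprint = tuple(sorted(tidy.items(), key=lambda kv: kv[0]))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(tidy)
    return cleaned


records = clean(convert(row) for row in extract(LEGACY_CSV))
```

Keeping the three steps as separate functions is only one possible design; it makes each step easy to test on its own, at the cost of passing over the data more than once.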
## External sources

Positives

* No costs for data entry
* No costs for quality checks
* Delegate expertise

Negatives

* No control over data quality
* No control over data structure
* May be incomplete
* May be ambiguous
* Questions of trustworthiness

# What does your data look like?
Sometimes an external source is ambiguous or incomplete relative to our expectations.

Different interests shape the content of the data we want to represent.
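
One way to make "our expectations" concrete is to write them down and check incoming records against them. The sketch below is only an illustration: the field names, the sample records, and the rule that a two-digit year counts as ambiguous are all assumptions, not part of any real dataset.

```python
# Fields we expect every record to have. These names are assumptions made
# for this example only.
EXPECTED_FIELDS = {"title", "year", "author"}


def find_problems(record):
    """Return a list of human-readable problems found in one record."""
    problems = []

    # Incomplete: an expected field is absent or empty.
    for field in sorted(EXPECTED_FIELDS):
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")

    # Ambiguous: a two-digit year could belong to more than one century.
    year = record.get("year")
    if isinstance(year, str) and len(year) == 2 and year.isdigit():
        problems.append(f"ambiguous year {year!r}")

    return problems


# Made-up records standing in for an external source.
external_records = [
    {"title": "Survey A", "year": "1998", "author": "J. Smith"},
    {"title": "Survey B", "year": "98", "author": ""},
    {"title": "Survey C", "author": "K. Jones"},
]

report = {r["title"]: find_problems(r) for r in external_records}
```

A different set of interests would mean a different `EXPECTED_FIELDS` and different rules, so the same external source could pass one check and fail another.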
# Licenses, sharing and ethics
## Why would someone let me use their data?

* Drive sales (commercial reasons)
* For the common good (ethical reasons)
* Contract requirements (contractual reasons, such as government contracts)

## Why not publish open data?

* Restrictions on source data (e.g. medical records)
* Control of use
* Value of the data (e.g. you are in the business of selling data)