Scraping refers to the process of acquiring data that have already been formatted for some display purpose.
- Usually it refers to data available via the web, but a file is a file is a file.
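
As an illustration of scraping, the sketch below pulls cell text out of an HTML table using only Python's standard-library `html.parser`. The class name `TableScraper`, the helper `scrape_table`, and the sample HTML are made up for this example; a real page would usually need more care (nested tables, merged cells, and so on).

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a cell; other tags inside the cell
        # (such as <sup>) contribute their text as well.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table in `html` as a list of rows of strings."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample: data already formatted for display, footnote included.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Springfield</td><td>1,234,567<sup>[1]</sup></td></tr>
  <tr><td>Shelbyville</td><td>98,765</td></tr>
</table>
"""

rows = scrape_table(SAMPLE_HTML)
```

Every scraped value is still a string, and display residue such as thousands separators and the footnote marker comes along with it.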
Cleaning is modifying a data table to:
- Correct mistakes such as data-entry blunders and inconsistent coding
- Put data-table values in a format that mainstream, generic software can read easily.
  - Store numerals as numbers, and dates and times in a form suitable for sorting and for calculating quantities such as intervals.
- Delete extraneous mark-up, e.g. footnote references in Wikipedia tables
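
The cleaning steps listed above can be sketched in a few small Python functions. Everything here is an assumption made for illustration: the function names, the bracketed-footnote pattern, the `MM/DD/YYYY` date format, and the `STATE_CODES` lookup are hypothetical and would need adapting to the table at hand.

```python
import re
from datetime import datetime

# Assumed pattern for footnote references such as "[1]" or "[note 2]".
FOOTNOTE = re.compile(r"\[[^\]]*\]")

# Hypothetical lookup that maps inconsistent codings to one standard code.
STATE_CODES = {"calif.": "CA", "california": "CA", "ca": "CA"}


def strip_footnotes(text):
    """Delete bracketed footnote references and surrounding whitespace."""
    return FOOTNOTE.sub("", text).strip()


def to_number(text):
    """Turn a displayed numeral such as '1,234,567[1]' into a number."""
    cleaned = strip_footnotes(text).replace(",", "")
    return float(cleaned) if "." in cleaned else int(cleaned)


def to_date(text, fmt="%m/%d/%Y"):
    """Parse a date string (format assumed) into a datetime.date."""
    return datetime.strptime(strip_footnotes(text), fmt).date()


def recode_state(text):
    """Map known variants to a standard code; leave unknown values alone."""
    key = text.strip().lower()
    return STATE_CODES.get(key, text.strip())


population = to_number("1,234,567[1]")
start = to_date("02/01/2020")
end = to_date("03/01/2020[a]")
interval_days = (end - start).days
states = [recode_state(s) for s in ["Calif.", " california ", "CA", "Texas"]]
```

Once the values are real numbers and dates, sorting and interval arithmetic behave as expected: as strings, "9" sorts after "10", whereas as numbers 9 sorts before 10.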