Raw data — This is the data as it comes in. In this state there could be three potential problems:
Technically correct data — Once the raw data has been modified to remove the discrepancies listed above, it is said to be "technically correct data."
Consistent data — At this stage, the data is ready to serve as the starting point for any sort of statistical analysis.
Statistical results and output — Once statistical results have been obtained, they can be stored for reuse. They can also be formatted for publication in various kinds of reports.
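The stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the function names (`to_technically_correct`, `to_consistent`, `summarize`), and the plausible age range are all hypothetical and not taken from any particular library or dataset. The raw-data problems it handles (untyped strings, stray whitespace, unparseable values) are examples chosen for the sketch, not a definitive list.

```python
import json
import statistics

# Hypothetical raw input: every field arrives as an untyped string.
raw_rows = [
    {"name": " Alice ", "age": "34"},
    {"name": "Bob", "age": "n/a"},
    {"name": "Carol", "age": "-5"},
    {"name": "Dave", "age": "41"},
]


def to_technically_correct(rows):
    """Give every field a proper type; unparseable ages become None."""
    typed = []
    for row in rows:
        try:
            age = int(row["age"])
        except ValueError:
            age = None
        typed.append({"name": row["name"].strip(), "age": age})
    return typed


def to_consistent(rows, min_age=0, max_age=120):
    """Keep only rows whose age is present and within an assumed plausible range."""
    return [
        r for r in rows
        if r["age"] is not None and min_age <= r["age"] <= max_age
    ]


def summarize(rows):
    """Compute a small statistical result from consistent data."""
    ages = [r["age"] for r in rows]
    return {"count": len(ages), "mean_age": statistics.mean(ages)}


technically_correct = to_technically_correct(raw_rows)
consistent = to_consistent(technically_correct)
results = summarize(consistent)

# Store the result for reuse; JSON is just one convenient choice.
stored = json.dumps(results)
print(stored)
```

Keeping each stage in its own function means the intermediate data can be inspected or saved separately, so a later analysis can start again from the consistent data rather than from the raw input.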
Visual Representation of Data
Presenting data in a well-structured format that the audience can read and understand is essential. Handling unstructured data and then presenting it visually can be challenging, and it is a task that organizations adopting big data will have to face. To meet this need, different types of graphs or tables can be used to represent the data.
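As a rough illustration of the two options, the sketch below renders the same small dataset once as a table and once as a simple bar chart, using plain text so that it needs no plotting library. The figures and the function names (`render_table`, `render_bar_chart`) are invented for this example; a real report would more likely use a dedicated charting tool.

```python
# Hypothetical summary values, invented for illustration only.
sales_by_region = {"North": 120, "South": 60, "East": 30, "West": 90}


def render_table(data, key_header="Region", value_header="Sales"):
    """Return the data as an aligned plain-text table."""
    width = max(len(key_header), *(len(k) for k in data))
    lines = [f"{key_header:<{width}}  {value_header}"]
    for key, value in data.items():
        lines.append(f"{key:<{width}}  {value}")
    return "\n".join(lines)


def render_bar_chart(data, max_bar=20):
    """Return a horizontal bar chart; the largest value fills max_bar characters."""
    width = max(len(k) for k in data)
    largest = max(data.values())
    lines = []
    for key, value in data.items():
        bar = "#" * round(value / largest * max_bar)
        lines.append(f"{key:<{width}}  {bar} {value}")
    return "\n".join(lines)


print(render_table(sales_by_region))
print()
print(render_bar_chart(sales_by_region))
```

The table gives exact values at a glance, while the bars make the relative sizes easier to compare; which one suits a report depends on what the audience needs to take away.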