This is an online tool that can be used to reformat, clean up and organise your dataset.
Once you have gathered your data and entered it into a spreadsheet, you are left with the laborious, yet essential, task of cleaning it up. A messy dataset not only hinders you from undertaking a meaningful analysis, but can also prevent you from using visualisation tools or statistical packages, as many require specific formatting. Most datasets need some sort of 'cleaning up' to improve their quality; issues such as misspellings, missing data and duplicated entries need to be identified and fixed.
Instead of the traditional method of manual editing, Data Wrangler offers 14 different options for formatting your dataset, and within each of these options is a variety of choices. These functions let you split, merge, delete, promote, wrap, fold and unfold your rows or columns, and cut, extract and edit your data. This variety of choices may seem overwhelming, but the tool helpfully suggests possible changes in ranked order and offers visual previews, so you can try the different options out before accepting them.
Data Wrangler enables you to reformat your data values or the layout of the dataset, amend incorrect or missing values and split entries into separate columns or rows. For example, the 'unfold' option allows you to identify different value types in one column and separate them out into new columns. Another example is the 'fill' option, which identifies empty cells in your dataset and copies the value from the preceding non-empty cell into each of them. This is a helpful function when dealing with visualisation tools, as many do not recognise empty cells and will not work if a dataset contains them. Data quality issues are often difficult to discover, so it is useful that Data Wrangler highlights errors in red, indicates whether a column has inconsistent data types and flags up the individual potential errors.
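To make the 'fill' and 'unfold' options more concrete, here is a minimal Python sketch of what such transformations do to a small table. It is not Data Wrangler's own code: the function names fill_down and unfold, the sample rows and the reading of 'unfold' as turning the distinct values of one column into new column headers are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Row = List[Optional[str]]


def fill_down(rows: List[Row], column: int) -> List[Row]:
    """Copy the preceding non-empty value in `column` into each empty cell.

    Hypothetical helper that imitates a 'fill' step; empty means None or "".
    """
    filled = []
    last_seen = None
    for row in rows:
        new_row = list(row)  # leave the input rows untouched
        if new_row[column] in (None, ""):
            new_row[column] = last_seen
        else:
            last_seen = new_row[column]
        filled.append(new_row)
    return filled


def unfold(rows: List[Row], key_col: int, value_col: int) -> Tuple[List[str], List[Row]]:
    """Turn the distinct values of `key_col` into new columns of `value_col`.

    Hypothetical helper that imitates an 'unfold' step. The remaining columns
    identify which output row each value belongs to.
    """
    keys = []      # distinct key values, in first-seen order
    grouped = {}   # identifying columns -> {key: value}
    for row in rows:
        ident = tuple(v for i, v in enumerate(row) if i not in (key_col, value_col))
        key = row[key_col]
        if key not in keys:
            keys.append(key)
        grouped.setdefault(ident, {})[key] = row[value_col]
    wide = [list(ident) + [values.get(k) for k in keys] for ident, values in grouped.items()]
    return keys, wide


# Invented sample data: the place name is only entered on the first row of each group.
rows = [
    ["Leeds", "2010", "5"],
    ["", "2011", "7"],
    ["York", "2010", "3"],
    ["", "2011", "4"],
]

filled = fill_down(rows, column=0)
new_columns, wide = unfold(filled, key_col=1, value_col=2)
```

In this sketch the empty place-name cells are filled first, and the year column is then unfolded so that each place ends up on a single row with one column per year.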
It leaves you with more time to analyse and visualise your data, rather than potentially wasting hours rearranging your dataset into a useable format.
Its privacy settings are a concern, as the Stanford Visualization Group is logging the different steps taken by the user and also some specific details about your dataset (see more in Privacy and Portability).
It can only be used online, as it is a browser-based web application.
Data Wrangler is free and is used online through your internet browser. Like OpenRefine, this is not a spreadsheet application and should be used after the data has been collected and logged. Copy the data from a CSV file, paste it into Data Wrangler and click 'Wrangle'. You should now see your data displayed on screen, with the interactive history bar and changes editor on the left-hand side.
CSV, JSON and TSV
Data Wrangler is a web-based service that needs continuous broadband internet connectivity in order to work. To assess how the tool is being used, the Stanford Visualization Group is logging the different steps that users are taking and also some specific details about your dataset. On their website they state, “they do not transmit or store your full pasted data set”; however, there is no independent evidence to confirm this.
Stanford Visualization Group