Working as a consulting analyst, I often find myself being brought into projects at various stages of the process. The best, of course, is to be part of the pre-process discussions, which allows for conversations on the data, the structure and, most importantly, scoping what the deliverable should be.
However, what I’ve often found is that the way these projects are stored, particularly for collaborative working, is grossly lacking in proper organisation. Now I won’t sit here and pretend that I am an organisational messiah - I am perhaps one of the worst culprits for disorganised folders. I mean, I could probably do a whole post on the number of random half-finished projects I have lying around on my desktop.
So the purpose of this mini-series is simple: to talk about how to organise files, documents and data on a shared workspace. No doubt there are a number of systems I won’t mention, and some of the options I present won’t work as well for everyone - the aim is to find something that works for you.
Folders
First in this series, I’m going to talk about folder structure. The most obvious question is this - why would you even bother setting up a folder structure?
Well, it physically hurts me when I see Desktops which look like this...
I mean, talk about airing your dirty laundry in public! At least shove it all into a “New Folder” and forget about it completely (this is my preferred method of non-organisation).
In reality though, the reason for organising your PC isn’t purely about being OCD. Having an actual system for organising your files and folders makes project management, and even just life management, that much easier.
And think of it this way: even just grouping similar files is one step closer to finding something when you go hunting for it. Of course, nowadays there’s rich natural text search, but I think grouping is a good habit to at least try to form. Plus, it’s quicker to delete a whole folder of useless files than to multi-select them one by one.
Stop using the Desktop as a dumping ground.
While it’s stupidly easy to get into the habit of “quickly saving to the Desktop” - mainly because it’s the easiest place to drop something and pick it up - it’s worth getting into the habit of cleaning your desktop as part of your ‘log-off’ routine. By this I mean either filing the files on the desktop away into a folder, or deleting what you no longer need. This is hard to do, especially when you’re unsure if you’ll need something in the short term. Personally, I make an “archive” folder on my Desktop which I delete/reorganise when I find some time each week.
Figure out a structure, stick to it.
The hardest thing is thinking of a structure. It’ll come naturally - but first, start thinking in hierarchies - which factors go together, which don’t, etc.
How does this work? Let’s take an example of a hierarchy:
As you can see, the "root" folder is a generic field, split by genre, by film series, then specific film. Getting into this mindset can be tough, but the way I approach it is a spring clean every now and then to make sure that I'm on top of my organisation.
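To make the hierarchy idea concrete, here is a minimal sketch in Python. The level names (root, genre, series, film) come from the description above, but the example values are purely hypothetical placeholders, not the folders from my own machine.

```python
from pathlib import PurePosixPath

def film_path(root, genre, series, film):
    """Join the hierarchy levels, most generic first, into one folder path."""
    return PurePosixPath(root, genre, series, film)

# Hypothetical example values, chosen only to illustrate the four levels.
example = film_path("Films", "Sci-Fi", "Star Wars", "A New Hope")
print(example)         # Films/Sci-Fi/Star Wars/A New Hope
print(example.parent)  # Films/Sci-Fi/Star Wars
```

Each step up via `.parent` is one level more generic, which is a handy check on whether the folders really nest the way you intended.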
Giving your files random names - Just, no.
I'll touch on this again in later posts, but the crux is this - I have countless files called "Book1.xlsx" or "test.yxdb" or "final_version.tde" all over my laptop. The issue of course is when you want to organise your documentation, and think about handovers - you have to open each one, rename it, check what breaks when you delete it... Organisation and having a set "process" helps with this, and can set a sustainable precedent for the future.
Think up a naming convention - How do you want to name your data files? How should you name your workbooks? What moniker should raw supporting documents have vs. ones which you've amended? Options include adding the date beforehand (e.g. 20170321_Project X.tde) or using a "version" suffix (e.g. ProjectX_Output v1) - though, a warning from personal experience: if you're working in an agile way with continuous iteration, this could become a long, long version list. Using a date prefix in year-month-day order means that files also stay in chronological order when sorted by name.
An interesting tidbit about spaces in filenames I wasn't aware of (though this may have changed in newer tech) is that using dashes (i.e. -) and underscores (i.e. _) is better than using spaces, as some software has issues reading spaces. Check out this link for a bit more information.
https://library.stanford.edu/research/data-management-services/data-best-practices/best-practices-file-naming
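As a rough sketch of the date-prefix convention, assuming you want spaces swapped for underscores, something like the following Python would do. The function name `dated_filename` is made up for this post; it isn't part of Tableau, Alteryx or any library.

```python
import re
from datetime import date

def dated_filename(project, extension, on=None):
    """Build a name such as 20170321_Project_X.tde (date prefix, no spaces)."""
    on = on or date.today()
    # Collapse any run of whitespace into a single underscore.
    safe = re.sub(r"\s+", "_", project.strip())
    return f"{on:%Y%m%d}_{safe}.{extension.lstrip('.')}"

print(dated_filename("Project X", "tde", date(2017, 3, 21)))  # 20170321_Project_X.tde

# Year-month-day prefixes sort chronologically when sorted as plain text.
names = [
    dated_filename("Project X", "tde", date(2017, 3, 21)),
    dated_filename("Project X", "tde", date(2016, 12, 1)),
]
print(sorted(names))  # the 2016 file comes first
```

The year-month-day order matters here: a day-month-year prefix would group files by day of the month rather than by when they were made.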
How does this work with data?
Above is an example of how a folder structure can be built for a project. Let’s talk through each level:
Project: This is the highest level of the hierarchy, which as mentioned should have a clear, obvious name. If working in a more “agile” manner, this can be the level where a sprint or v1 is kept.
Project Docs: Any documents which are supplementary to the project, be it presentations, data definitions or similar.
Alteryx Modules: Within this, the user can split as required - workflows (.yxmd files) stored in one folder, intermediate data (here meaning .yxdb files or input files) in another, and a further folder for macros.
Tableau Outputs: Workbooks and .tde files can be stored here.
Whatever else is required: Project by project, this can change - folders which could be added include “Raw Data” as well as ‘inputs’ and ‘outputs’. User discretion comes into play here.
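If you set up projects often, the levels above can be scripted so every project starts from the same skeleton. This is only a sketch: the top-level folder names follow the list above (with underscores instead of spaces), while the sub-folder names under Alteryx_Modules and the `create_project` helper are my own assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Sub-folder names under Alteryx_Modules are assumed, not prescribed.
PROJECT_LAYOUT = [
    "Project_Docs",
    "Alteryx_Modules/Workflows",
    "Alteryx_Modules/Intermediate_Data",
    "Alteryx_Modules/Macros",
    "Tableau_Outputs",
    "Raw_Data",
]

def create_project(root, name, layout=PROJECT_LAYOUT):
    """Create the project folder and its sub-folders; existing ones are left alone."""
    project = Path(root) / name
    for sub in layout:
        (project / sub).mkdir(parents=True, exist_ok=True)
    return project

if __name__ == "__main__":
    # Try it in a throwaway directory rather than on a real shared drive.
    with tempfile.TemporaryDirectory() as tmp:
        project = create_project(tmp, "20170321_Project_X")
        for folder in sorted(p for p in project.rglob("*") if p.is_dir()):
            print(folder.relative_to(project))
```

Because `exist_ok=True` is set, running it twice against the same project does no harm, so it can also be used to top up a project that is missing a folder.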
What’s next?
As I alluded to above, ultimately this blog series is going to be about how to document your data processes, and sharing examples of practices (note the lack of "best" before the word) which can be used to ensure that data projects are communicated clearly and in a structured manner.
Given the tools I use day to day, I’ll mainly dive into Tableau and Alteryx to explain methods of handover and documentation for those two tools. The methods, however, may be transferable to other software.