Last fall I held an internal course at my company with the title Coding with performance and quality in mind. In this post I will write about one part of that course, cache usage, and I will try to answer the question: why do you have to care?
Some of you might say:
- Hey, you! This is nothing new. If I Google “memory cache performance” I get 55,800,000 hits.
Why another post?
Well, I will not write another post with lots of details. There are already lots of great posts for that (Google is your friend). I am writing this post for you who are a software developer but don’t know so much about hardware.
The big conclusion
You want the right data inside your cache at the right time!
- That sounds good!
But what is a cache?
In your computer, smartphone or some other device, you have main memory, normally called RAM, and you have cache, which is a superfast memory. Some parts of the cache are located on the same chip as the CPU, so they sit really close to the CPU, which makes access quick (see figure below).
Details: The cache can be divided into L1, L2 and L3, but there are also the TLB, I-cache and D-cache (check the links at the end for geek info).
- Now I know what it is, but
How is it used?
Data that the CPU needs now or very soon is read from main memory (or from a cache further away) into the cache so it can be accessed really quickly. It would take longer if the CPU had to go all the way over the data bus and copy the data every time before using it. The CPU might need the data again very soon, and therefore it is temporarily stored in the cache.
But there are limitations! The cache is limited in size. For instance, my Samsung Galaxy SIII has a separate L1 data cache for each core. Another limitation is that the CPU can only copy chunks of data, called cache lines, which typically have a size of 32, 64 or 128 bytes.
- So what’s the problem? Just copy what you need!
Before the CPU starts copying data from main memory to the cache, it first checks whether the data already exists in the cache. If it doesn’t, it is called a cache miss and the CPU needs to do more work, which you don’t want.
- OK, I understand what you mean. It’s good to have some understanding of what limitations the hardware has, but
How do I minimize cache misses?
You organize your data after usage! Imagine you have a data struct, like the one below, and you want to search for some person’s id. The size of one cache line is limited to 32 bytes.
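struct person
{
    unsigned int id;        /* checked on every step of the search */
    char data[40];          /* rarely needed while searching */
    struct person *pNext;   /* followed on every step of the search */
};
...
/* Walk the list until the id matches or the list ends. */
while (ptr != NULL && ptr->id != magicId)
{
    ptr = ptr->pNext;
}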
When the CPU copies one cache line, it will contain the id and only the first part of the first person’s data, because of the size of the cache line. If the first person’s id doesn’t match, the search continues by updating the ptr pointer. The problem is that pNext lies after the 40 bytes of data, so it is not inside the cache line that was just loaded. The CPU needs to copy that information from main memory into the cache. This is a typical cache miss. A better solution is to move the most commonly used data close together, as below:
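struct person
{
    unsigned int id;        /* used on every step of the search */
    struct person *pNext;   /* now close to id, in the same cache line */
    char data[40];          /* rarely used data moved to the end */
};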
- So you are saying that I should reorganize all my data structs now?
No, first you analyse your code in a profiler and find your hot spots. Cache misses might not be your biggest problem, but it’s good to understand them if one of your loops turns out to be a hot spot.
I am a big fan of Static Code Analysis (SCA) because I learn to produce better code!
I have worked for more than 10 years as a professional embedded software developer in several different environments and with different platforms and languages, and I believe I can write decent code.
But you know what, I can’t!!
I have not had a single week in my life in which I delivered perfect code with no errors. The weird part is that I even teach my colleagues how to write better code, and write this blog on the subject. So why is it so hard to avoid those tiny, tiny errors? Well, because we are humans and not machines. Yep!! You read that right.
If I were a machine it would be possible for me to think of everything that I need to think of. Psychologists say that the human mind can hold 7 +/- 2 things at a time. If we try to remember more, we start to forget. I believe that you can apply that to writing code. Each software developer focuses on different things depending on her/his previous experiences. For me, I usually focus on these things:
- have I initialized parameters?
- can I refactor the code?
- have I covered the new code with unit tests?
- can someone else understand it?
- is this code good from a performance perspective (like cache and memory usage)?
That was five. Maybe I missed something, but you know, I am human!
Normally, after I have written some code and run my SCA and it finds an error or warning, I quite often say to myself: ahhhh, I missed that. Sloppy of me! But sometimes it indicates something new, and my mind goes:
Stop! This one is new! Let's do some digging.
What I don't do is fix it right away and rerun the SCA. First I make sure I understand the problem and then I fix it. That way I always learn something new, and maybe next time I remember seven things to check for instead of five.
Finally I did it! I had my first bigger/higher retrospective. It wasn’t perfect, but I learned a lot.
In April last year I wrote a post on the internal forum at my company about Retrospective on a higher level, where I discussed that we were missing some form of finalizing round-up of each project: a planned time where all developers, testers, architects and managers sit together and give their view on how they thought it went. A possibility to visualize the choices that turned out well or badly, so that we can learn from them.
Why bigger?
Running retrospectives after each sprint is a great way to reflect after small changes. Maybe your team has tried pair programming or TDD for a sprint, and then the sprint retrospective is the perfect time to reflect on what everyone thought about it.
But during a project that runs for several months or years, a little bit more happens:
- Several teams are working with different parts and the code changes a lot.
- There will be a lot of communication and synchronization between the teams and managers.
- New ways of working are introduced, like remote pair programming.
- New tools are introduced.
- People leave the company and new people start.
This list can be made long. My point is that for some changes a three-week sprint is not enough to reflect on. You need more input data to spot the delta.
My first trial
After reading Joakim Sunden’s post, Running big retrospective at Spotify, I finally moved from just talking about it to actually doing it. And it was great!
What I did was:
- I invited everyone who had been involved during a 6 month period.
- Booked a conference room for a whole afternoon.
- Prepared myself well with an agenda, Post-it notes, etc.
- No computer!
- Happy mood curve.
As I mentioned in the beginning, the result wasn’t perfect, but it was OK. Unfortunately only developers attended, which gives only one side of the project. What I missed was that several people were invited as optional and not required. Yes, some people actually take notice of this.
The good part was that I really believe everyone who was there enjoyed it. One important part, which I also took extra care with, was that everyone should participate. It’s very common that only one or two people speak all the time (some smart people sit silent in the back and don’t care about agile and processes). By letting everyone write down what they did during the gathering data phase, and put the notes on a timeline on the whiteboard afterwards, everyone got a chance to share their thoughts and feelings.
Another good factor was the happy mood curve. Everyone was asked to draw a “sine curve” describing how they felt about their tasks during the whole project. There was a clear correlation between the curves and “how well” the project went.
The last success factor was to not bring any computers. We did it the true kindergarten way: we used pencil and paper.
I will do it again!