Using the Cycle Counter Registers on the Raspberry Pi 3

ARM processors support various performance monitoring registers, the most basic being a cycle count register. This is how to make use of it on the Raspberry Pi 3 with its ARM Cortex-A53 processor. The A53 implements the ARMv8 architecture, which can operate in both 64- and 32-bit modes; the Pi 3 uses the 32-bit AArch32 mode, which is more or less backwards compatible with the ARMv7-A architecture, as implemented for example by the Cortex-A7 (used in the early Pi 2s) and Cortex-A8. I hope I’ve got that right; all these names are confusing.

The performance counters are made available through coprocessor registers and the mrc and mcr instructions, the precise registers used depending on the particular architecture.
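
To make the register encodings concrete, here is a small sketch in plain C (it runs anywhere, not just on the Pi) that tabulates the four registers this post touches and formats the matching mrc/mcr instruction. The operand values are the ones used in the code further down; the register names (PMCR, PMCNTENSET, PMCCNTR, PMUSERENR) are the ARMv7-A architectural names as far as I know, and the struct and function names are just made up for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <stddef.h>

/* One AArch32 CP15 register encoding: the operands of an mrc/mcr p15 access. */
struct cp15_reg {
    const char *name;          /* ARMv7-A architectural name (assumed label) */
    int opc1, crn, crm, opc2;  /* mrc/mcr p15, opc1, <Rt>, cN, cM, opc2 */
};

/* The performance monitor registers used in this post. */
static const struct cp15_reg pmu_regs[] = {
    { "PMCR",       0, 9, 12, 0 },  /* control: bit 0 enables the counters */
    { "PMCNTENSET", 0, 9, 12, 1 },  /* count enable set: bit 31 = cycle counter */
    { "PMCCNTR",    0, 9, 13, 0 },  /* the cycle count itself */
    { "PMUSERENR",  0, 9, 14, 0 },  /* user enable: bit 0 allows userspace access */
};

/* Look up a register by name; returns NULL if it is not in the table. */
static const struct cp15_reg *find_reg(const char *name)
{
    for (size_t i = 0; i < sizeof pmu_regs / sizeof pmu_regs[0]; i++)
        if (strcmp(pmu_regs[i].name, name) == 0)
            return &pmu_regs[i];
    return NULL;
}

/* Format the instruction that reads (is_read != 0) or writes the register. */
static int format_access(const struct cp15_reg *r, int is_read,
                         char *buf, size_t len)
{
    return snprintf(buf, len, "%s p15, %d, <Rt>, c%d, c%d, %d",
                    is_read ? "mrc" : "mcr",
                    r->opc1, r->crn, r->crm, r->opc2);
}
```

Calling format_access() on the entry returned by find_reg("PMCCNTR") with is_read set gives the same operands as the mrc instruction used to read the counter in the user code below.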

By default, use of these instructions is only possible in “privileged” mode, i.e. from the kernel, so the first thing we need to do is to enable register access from userspace. This can be done through a simple kernel module that also sets up the cycle counter parameters needed (we could do this from userspace after the kernel module has enabled access, but it’s simpler to do everything at once).

To compile a kernel module, you need a set of header files compatible with the kernel you are running. Fortunately, if you have installed a kernel with the raspberrypi-kernel package, the corresponding headers should be in raspberrypi-kernel-headers – if you have used rpi-update, you may need to do something else to get the right headers, and of course if you have built your own kernel, you should use the headers from there. So:

$ sudo apt-get install raspberrypi-kernel
$ sudo apt-get install raspberrypi-kernel-headers

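Our Makefile is just (note that the file must be named Makefile, and the indented lines must start with a tab, not spaces):

obj-m += enable_ccr.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean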

and the kernel module source is:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/smp.h>   // on_each_cpu()

// Runs on each CPU: allow userspace access and start the cycle counter
static void enable_ccr(void *info) {
  // Set the User Enable register, bit 0
  asm volatile ("mcr p15, 0, %0, c9, c14, 0" :: "r" (1));
  // Enable all counters in the PMNC control register (bit 0)
  asm volatile ("mcr p15, 0, %0, c9, c12, 0" :: "r" (1));
  // Enable cycle counter specifically
  // bit 31: enable cycle counter
  // bits 0-3: enable performance counters 0-3 (not set here)
  asm volatile ("mcr p15, 0, %0, c9, c12, 1" :: "r" (0x80000000));
}

int init_module(void) {
  // Each CPU has its own set of registers; wait until all CPUs are done
  on_each_cpu(enable_ccr, NULL, 1);
  printk(KERN_INFO "User-level access to CCR has been turned on\n");
  return 0;
}

void cleanup_module(void) {
}

To build the module, just use make:

$ make

and if all goes well, the module itself should be built as enable_ccr.ko.

Install it:

$ sudo insmod enable_ccr.ko

$ dmesg | tail

should show something like:

...
[ 430.244803] enable_ccr: loading out-of-tree module taints kernel.
[ 430.244820] enable_ccr: module license 'unspecified' taints kernel.
[ 430.244824] Disabling lock debugging due to kernel taint
[ 430.245300] User-level access to CCR has been turned on
...

(It should go without saying that making your own kernel modules and allowing normally forbidden access from userspace may open up all sorts of potential vulnerabilities that you should be wary of.)

Now we can use the cycle counters in user code:

#include <stdio.h>
#include <stdint.h>

static inline uint32_t ccnt_read (void)
{
  uint32_t cc = 0;
  __asm__ volatile ("mrc p15, 0, %0, c9, c13, 0":"=r" (cc));
  return cc;
}

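int main() {
  // Two back-to-back reads measure the cost of reading the counter
  uint32_t t0 = ccnt_read();
  uint32_t t1 = ccnt_read();
  printf("%u\n", t1-t0);
  // volatile stops the compiler from removing the loop
  volatile uint64_t n = 100000000;
  while(n > 0) n--;
  t1 = ccnt_read();
  printf("%u\n", t1-t0);
  return 0;
}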

We use a volatile loop counter so the loop isn’t optimized away completely.
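
One thing worth keeping in mind: the counter as read here is only 32 bits wide, so at 1.2GHz it wraps around roughly every 3.6 seconds. Here is a minimal sketch of why the plain t1-t0 subtraction above is still fine for short measurements (the helper names are mine, not part of any API). Unsigned arithmetic is modulo 2^32, so the difference is correct as long as fewer than 2^32 cycles have elapsed between the two reads.

```c
#include <stdint.h>

/* Cycles elapsed between two readings of the 32-bit cycle counter.
 * The subtraction is done modulo 2^32, so the result is still right if the
 * counter wrapped once between the readings (fewer than 2^32 cycles). */
static inline uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}

/* Seconds a free-running 32-bit counter takes to wrap at a given clock rate. */
static inline double wrap_period_seconds(double clock_hz)
{
    return 4294967296.0 / clock_hz;   /* 2^32 cycles */
}
```

For example, a start value of 0xFFFFFFF0 and an end value of 0x10 gives 0x20 (32) cycles, even though the raw end value is smaller than the start. Anything that runs for longer than the wrap period needs a different approach, such as counting wraps or using a coarser clock.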

Compile, then use taskset to keep the process on one CPU (each CPU has its own counter):

$ gcc -Wall -O3 cycles.c -o cycles
$ time taskset 0x1 ./cycles
1
805314304

real 0m0.712s
user 0m0.700s
sys 0m0.010s

Looks like we can count a single cycle, and since the Pi 3 has a 1.2GHz clock, the loop time looks about right. (The clock seems to be scaled down when the processor is idle, so we don’t necessarily get a full 1.2 billion cycles per second, for example if we replace the loop above with a sleep.)
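
As a rough sanity check on “about right”, here is the arithmetic as a small C sketch (the function names are made up for illustration). It divides the cycle count by the wall-clock time from the run above; bear in mind that the “real” time also includes process startup, so this is only an estimate.

```c
#include <stdint.h>

/* Average clock rate implied by a cycle count over a wall-clock interval. */
static double effective_hz(uint32_t cycles, double seconds)
{
    return (double)cycles / seconds;
}

/* Average number of cycles spent per loop iteration. */
static double cycles_per_iteration(uint32_t cycles, uint64_t iterations)
{
    return (double)cycles / (double)iterations;
}
```

With the figures above, effective_hz(805314304, 0.712) comes out at about 1.13GHz, a little under the nominal 1.2GHz, and cycles_per_iteration(805314304, 100000000) is about 8 cycles for each pass through the volatile loop.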

References:

Cortex-A8 (ARMv7-A) coprocessor registers, which the AArch32 mode used here is compatible with:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344k/Bgbjjhaj.html

http://infocenter.arm.com/help/topic/com.arm.doc.ddi0344k/Bgbdeggf.html

A useful forum discussion (which includes some details on accessing other performance counters):

https://www.raspberrypi.org/forums/viewtopic.php?f=29&t=181490

Cycle counters on the Pi 1:

https://blog.regehr.org/archives/794


10 Comments on “Using the Cycle Counter Registers on the Raspberry Pi 3”

  1. Mr. K says:

    There is a small typo. In your Makefile you address enable-ccr.o (hyphen) while you suggest to name the C file enable_ccr.c (underscore).

  2. David says:

    Help! I copied the above snippets to text editor and saved the files as makefile and enable_ccr.c respectively. I learned that in the makefile, you HAVE to use TAB instead of spacing on the line below all: and clean: respectively from a google search, but now I’m stuck with this after using the make command:

    pi@raspberrypi:~ $ make
    make -C /lib/modules/4.14.79-v7+/build M=/home/pi modules
    make[1]: Entering directory ‘/usr/src/linux-headers-4.14.79-v7+’
    scripts/Makefile.build:45: /home/pi/Makefile: No such file or directory
    make[2]: *** No rule to make target ‘/home/pi/Makefile’. Stop.
    Makefile:1527: recipe for target ‘_module_/home/pi’ failed
    make[1]: *** [_module_/home/pi] Error 2
    make[1]: Leaving directory ‘/usr/src/linux-headers-4.14.79-v7+’
    makefile:4: recipe for target ‘all’ failed
    make: *** [all] Error 2

    • David says:

      Ok, I solved my own problem:)

      first, I needed to save makefile as Makefile (I assumed capitalization wouldn’t matter :\

      then, I found on line 1 of the Makefile, it says …enable-ccr.o… which needs to be changed to …enable_ccr.o

      Once I made those changes, it ran correctly.

  3. matthew says:

    Thanks very much for your comments. Yes, make is fussy about tabs – the original had them, but WordPress seems to have replaced them with spaces, I’ve added a note.

    And thanks for pointing out the typo with enable_ccr, sorry about that. Fixed now.

  4. badger says:

    Hi. It looks like the include files are missing in your code.
    There is definitely something weird going on with your wordpress.

  5. hammaster says:

    I see that you’ve mentioned rpi-update and new headers. What is the kernel version that you’ve been using?

    • matthew says:

      Currently my pi3 is on 4.14.98 (the latest for stretch) – I’m not using rpi-update, I tried it a while back and had some problems getting the right version of kernel headers, but the situation could well be different now, I haven’t been keeping up.

