7

As far as I understand, user-space programs run in unprivileged mode, and thus do not have direct access to memory or I/O.

Then how exactly can we directly access memory or I/O locations when we mmap /dev/mem in user space programs?

For example:

int fd = 0;
u8 *leds = NULL;  /* must be a pointer: mmap returns an address */
fd = open("/dev/mem", O_RDWR|O_SYNC);
leds = (u8 *)mmap(0, getpagesize(), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0x80840000);

This is a hack very commonly used in embedded devices.

Now the variable leds can be used on the fly to access any device that could be present at 0x80840000.

We won't be using any system call to access that address anymore.

Even something like

leds[0x20] = val;

would work.

But privileged operations, such as reading or writing directly to an I/O address, should be possible only by putting the processor into privileged mode through a system call.


Stark07
  • Show an example where that is happening as an unprivileged user? – wurtel Nov 14 '14 at 12:09
  • I think you're missing a `*` in the declaration of `leds`, but that's just code, no evidence of this actually working as an unprivileged user; in my (limited) experience, everything runs as root on embedded devices. – wurtel Nov 14 '14 at 14:54
  • @wurtel - my bad, I might have copy-pasted poorly... Yes, everything does work as root, but my question goes down to the very heart of the OS. So does this mean that if you are root, you get just as many privileges as the kernel itself? – Stark07 Nov 16 '14 at 10:41
  • The `root` user has many privileges, which include being allowed to override traditional filesystem permissions. Some "files" are really access to devices (like disk or memory), which are off-limits for regular users but not for `root`. But `root` runs in *userspace*, so it doesn't enjoy the full privileges of *system mode* (executing privileged instructions, mostly). – vonbrand Feb 28 '16 at 23:26

3 Answers

9

Allowing access to /dev/mem by unprivileged processes would indeed be a security problem and should not be permitted.

On my system, ls -l /dev/mem looks like this:

crw-r----- 1 root kmem 1, 1 Sep  8 10:12 /dev/mem

So root can read and write it, members of the kmem group (of which there happen to be none) can read it but not write it, and everyone else cannot open it at all. So this should be secure.

If your /dev/mem is anything like mine, your unprivileged process should not even have been able to open the file at all, let alone mmap it.

Check the permissions of /dev/mem on your system to make sure they are secure!
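As a rough sketch of the same check done from a program rather than with ls -l, the following C reads the mode bits with stat(). The function names world_accessible and report are made up for this illustration, and the check covers only the "other" permission bits; it says nothing about group membership, capabilities, or kernel options such as CONFIG_STRICT_DEVMEM.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if users outside the owner and group
 * ("other") may read or write the given path, 0 if they may not, and -1
 * if stat() fails (for example because the path does not exist). */
int world_accessible(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (st.st_mode & (S_IROTH | S_IWOTH)) ? 1 : 0;
}

/* Hypothetical helper: prints a one-line verdict for the given path. */
void report(const char *path)
{
    int r = world_accessible(path);

    if (r < 0)
        perror(path);
    else
        printf("%s: %s\n", path,
               r ? "readable or writable by everyone"
                 : "not accessible to other users");
}
```

Calling report("/dev/mem") on a system with the permissions shown above should say that the device is not accessible to other users.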

Celada
  • But assuming that I *am* doing this as root... Does it mean that a process run by root gets kernel level access privileges? – Stark07 Nov 16 '14 at 10:41
  • Well, if you're doing this as root then you have access to everything. I'm not sure what you mean by "kernel level access privileges", but root can certainly do anything it wants to, one way or another (in particular, for example, by crafting and dynamically loading a new kernel module). – Celada Nov 17 '14 at 00:37
  • By privileges I mean the 4-ring privilege system used in Intel processors, for example. I am not talking about user-level access privileges. – Stark07 Nov 17 '14 at 05:42
  • I understand. Yes indeed, all user space code, including root's, runs at a lower CPU privilege level than kernel code. But that doesn't have anything to do with who can and can't access `/dev/mem`. – Celada Nov 17 '14 at 05:50
  • That is the section I can't quite get my head around. It's not about who can and can't access /dev/mem. I know only root can do that. My concern is only regarding the bare level transaction that's happening. – Stark07 Nov 17 '14 at 06:20
  • /dev/mem is just an abstraction to be able to map the actual RAM as a memory space to your process. But when I read/write to a pointer that has been mmap'ed to /dev/mem, that user space process is actually sending direct I/O instructions to that particular address range without the use of any system calls. – Stark07 Nov 17 '14 at 06:47
  • Yes, that's correct: when you map a file, you afterwards get to read & write the file without the use of any system calls. That's true whether you've mapped a regular disk file or a special device like `/dev/mem`. If you prefer, you could open `/dev/mem` and issue `read()` and `write()` system calls. Then you would not be sending I/O without the use of any system calls. The end result is the same, but if you have lots of small I/O to do, `mmap()` and direct access will probably perform better specifically because you don't need system calls! This is true for regular files too! – Celada Nov 17 '14 at 07:20
  • @Stark07 I think you've missed the point. `mmap` (and `/dev/mem`) itself does **NOT** provide *direct* access to the system memory, and it is impossible to do so in Ring 4. What it does instead is just map a file (or resource) to a specific virtual address in the calling process, just like what happens when a program is loaded/`exec`ed (in this case, the program image is `mmap`ed to the virtual memory). – minmaxavg Apr 08 '16 at 02:09
  • TL;DR An unprivileged process does *not* have direct access to system resources, but **can** access them when those resources are mapped into its virtual address space. There's no difference between accessing its stack/heap/rodata/whatever and accessing kernel memory: they both eventually access the actual physical memory, while the latter case just happens to be under the program's control. – minmaxavg Apr 08 '16 at 02:13
  • I meant Ring 3.... – minmaxavg Apr 08 '16 at 02:18
  • @Stark07: This *just* grants the user-space process the ability to read/write the region of memory that the kernel lets it `mmap` (see also `CONFIG_STRICT_DEVMEM`). It still can't execute privileged instructions like [`lidt`](http://felixcloutier.com/x86/LGDT:LIDT.html) to set a new Interrupt-handler Table, or [`invd`](http://felixcloutier.com/x86/INVD.html) to invalidate all caches *without* writing dirty data back to memory first. Note the "Protected Mode Exceptions" section: "#GP(0) If the current privilege level is not 0." – Peter Cordes Sep 01 '17 at 21:29
7

The addresses visible to a user process (whether running as root or as an unprivileged user) are virtual addresses, which get mapped to physical addresses by the MMU through the page tables. Setting up the page tables is a privileged operation, and can only be performed by kernel code; however, once the page tables are set, accessing the memory is allowed in user mode.

Concretely, your code uses mmap to request that the kernel set up the page tables to allow access to a given range of physical memory. The kernel checks the process's privileges (it has read/write access to /dev/mem) and sets up the page tables to allow it to access physical memory.
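The same division of labour can be observed without root by mapping an ordinary file instead of /dev/mem. The sketch below is only an illustration under that substitution: the function name store_via_mapping and the scratch file under /tmp are made up for the example. The mmap call is the one point where the kernel sets up the mapping; the store through the pointer afterwards is a plain memory write.

```c
#define _XOPEN_SOURCE 700   /* for mkstemp, ftruncate, pread */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: maps one page of a scratch file, stores `val` at
 * `offset` through the mapping, then reads that byte back through the file
 * descriptor. Returns the byte read back, or -1 on any error. */
int store_via_mapping(size_t offset, unsigned char val)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";  /* scratch file, name is arbitrary */
    long page = sysconf(_SC_PAGESIZE);
    unsigned char readback = 0;
    int result = -1;
    int fd;

    if (page <= 0 || offset >= (size_t)page)
        return -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                   /* the file goes away once fd is closed */

    if (ftruncate(fd, page) == 0) {
        /* System call: the kernel checks the descriptor's access mode and
         * arranges the mapping for this process. */
        unsigned char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            p[offset] = val;        /* ordinary store, no system call involved */

            /* Flush so that the read below is guaranteed to see the store. */
            msync(p, (size_t)page, MS_SYNC);
            if (pread(fd, &readback, 1, (off_t)offset) == 1)
                result = readback;
            munmap(p, (size_t)page);
        }
    }
    close(fd);
    return result;
}
```

For instance, store_via_mapping(0x20, 0xAB) mirrors the leds[0x20] = val access from the question, with a scratch file standing in for the device.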

jch
2

The value of leds is a virtual address. As long as it is in the user space of the current process, the process can access it directly via instructions like leds[0] = val without having to be in privileged mode, no matter where in RAM this virtual address is mapped to.
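One way to see that the pointer value is only a virtual address is to map the same page twice. In the sketch below (the function name aliases_share_memory and the scratch file are invented for the illustration, and an ordinary file stands in for /dev/mem so that no root access is needed), the two mappings get different virtual addresses, yet a store through one is visible through the other because both refer to the same underlying page.

```c
#define _XOPEN_SOURCE 700   /* for mkstemp, ftruncate */
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: maps the same scratch page twice with MAP_SHARED.
 * Returns 1 if the two virtual addresses differ and a store through the
 * first mapping is visible through the second, 0 if not, -1 on error. */
int aliases_share_memory(void)
{
    char path[] = "/tmp/alias_demo_XXXXXX";  /* scratch file, name is arbitrary */
    long page = sysconf(_SC_PAGESIZE);
    int result = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                   /* the file goes away once fd is closed */

    if (page > 0 && ftruncate(fd, page) == 0) {
        unsigned char *a = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
        unsigned char *b = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);

        if (a != MAP_FAILED && b != MAP_FAILED) {
            a[0] = 0x5A;            /* store through the first virtual address */
            result = (a != b && b[0] == 0x5A) ? 1 : 0;
        }
        if (a != MAP_FAILED)
            munmap(a, (size_t)page);
        if (b != MAP_FAILED)
            munmap(b, (size_t)page);
    }
    close(fd);
    return result;
}
```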

xiaokaoy