Privilege Escalation Attack - Linux (CVE-2017-0358)

Nov 2017

This article is a deep dive into CVE-2017-0358. It starts with an introduction to the attack, then explains the components involved, and finally details the vulnerability and the exploit implementation.

Introduction

CVE-2017-0358 is a local privilege escalation vulnerability on Linux reported by Jann Horn of Google Project Zero. Jann Horn exploited a design pitfall in NTFS-3G to escalate privileges. NTFS-3G is an open source, cross-platform implementation of the Microsoft Windows NTFS file system with read-write support. It comes pre-installed on most Linux distros. The package uses FUSE (Filesystem in Userspace) to mount NTFS partitions. If FUSE is not already supported by the running kernel, NTFS-3G attempts to load the fuse module using modprobe. This is a security pitfall because NTFS-3G launches modprobe with an effective uid of 0, even though modprobe is not designed to run in a setuid context. CVE-2017-0358 exploits this design pitfall.

Components

NTFS-3G

NTFS-3G is a read-write NTFS driver for Linux distros (a module that lets you read and write Windows files from Linux). It supports the NTFS file systems of all Windows versions from Windows XP to Windows 10. The package is widely used and is available for embedded systems as well.

If your computer dual boots Windows and Linux, you have most probably handled Windows files from Linux, and if so, you have already used NTFS-3G!

Linux kernel module

Linux kernel modules are pieces of code that can be loaded into and unloaded from the kernel on demand. They extend the functionality of the kernel without the need to reboot the system. For example, when you plug a mouse into your computer, the corresponding device driver needs to be loaded into the kernel. This device driver can be considered a kernel module. It loads into the kernel without any restart and extends the functionality of the system to make use of the device plugged in.

A custom kernel module typically has three important parts.

  1. Module init (__init): The function called when the module is loaded. Its return value acts as a status flag: if loading fails, the function is expected to return an error, which is reported back to the program that requested the load (for example modprobe).
  2. Module exit (__exit): The function called when the module is unloaded from the kernel. It is supposed to clean up by undoing everything that was set up for the module.
  3. Module param (module_param): A macro that lets you pass parameters to the module when it is loaded.

In Linux, insmod, rmmod and modprobe are used to load these modules into the kernel or unload them from it.

Modprobe

modprobe is a Linux program that intelligently adds a module to the kernel or removes one from it. It does the same loading and unloading work as insmod and rmmod, and is called intelligent because it also detects the dependencies of the module and loads them automatically.
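To make the idea of dependency detection concrete, here is a small sketch of how a loader could order modules so that dependencies are loaded first. It is only a model: the dependency table and module names are hypothetical, and real modprobe reads this information from its own index files rather than from a Python dictionary.

```python
def load_order(module, deps):
    """Return the modules to load, dependencies first, each one only once."""
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in deps.get(name, []):
            visit(dep)          # load what this module needs first
        order.append(name)      # then the module itself

    visit(module)
    return order


# Hypothetical dependency table (module -> modules it depends on).
example_deps = {
    "ntfs_helper": ["fs_core", "crypto_util"],
    "fs_core": ["crypto_util"],
    "crypto_util": [],
}

print(load_order("ntfs_helper", example_deps))
```

With insmod alone, the user would have to work out this order by hand and load each module separately.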

FUSE

FUSE or Filesystem in Userspace is a Linux software interface that allows the user to write their own file systems without editing the kernel code.

For example, let's say you would like a filesystem where any ".csv" file that you save is automatically converted to a spreadsheet, any ".doc" file is converted to PDF, and so on. This can be realized by writing a custom filesystem using FUSE. In fact, NTFS-3G is built upon FUSE.

Vulnerability

When NTFS-3G is invoked to mount an NTFS partition, it relies on the FUSE module to do so. First, it reads /proc/filesystems to see whether the FUSE module is already loaded. If it fails to open /proc/filesystems, it assumes that FUSE is not currently supported. In that case, it invokes modprobe with root privileges to load the FUSE module into the kernel. To give modprobe root privileges, NTFS-3G (which runs setuid root) sets its uid to 0 before launching it. The issue is that modprobe is not designed to run in a setuid context, because it accepts arguments from environment variables that the unprivileged caller controls. By setting crafted environment variables, we trick modprobe into loading our malicious module instead of the actual FUSE module, and thus load malicious code into the kernel.
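The decision described above can be sketched as follows. This is a simplified model and not NTFS-3G's actual code; the function names are made up for illustration. The important detail is the error branch: a failure to open the file is treated the same as "FUSE is not there".

```python
def fuse_listed(filesystems_text):
    """True if a filesystem named 'fuse' appears in /proc/filesystems text."""
    # Each line looks like "nodev<TAB>name" or "<TAB>name"; the name is last.
    names = [fields[-1]
             for fields in (line.split() for line in filesystems_text.splitlines())
             if fields]
    return "fuse" in names


def fuse_supported(path="/proc/filesystems"):
    """Model of the check: unreadable file is assumed to mean 'no FUSE'."""
    try:
        with open(path) as f:
            return fuse_listed(f.read())
    except OSError:
        # This is the branch the attack forces: the open fails, so the
        # caller concludes that the fuse module must be loaded via modprobe.
        return False
```

An attacker does not need to change the contents of /proc/filesystems; making the open fail is enough to reach the modprobe path.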

Exploit Mechanism

In this example, the malicious module does not do its damage from inside the kernel. Instead, we use it to obtain a binary that can create a root shell for us whenever needed. The procedure of the exploit is as follows:

1. Create a malicious kernel module

First, we create a binary (say rootshell) that executes /bin/bash, so that once our binary gets root privileges, it gives us a bash shell running as root.

Now, create a malicious module that will be loaded into the kernel. The malicious module will have an __init function which returns a “fake error” (so the module does not stay loaded) after setting the following on our rootshell binary:

  1. uid=0, gid=0
  2. read & execute permissions for everyone, together with the setuid bit, so that the binary runs as its new owner (root)

2. Setting the stage for modprobe

We set the following environment variable to mislead modprobe.

MODPROBE_OPTIONS="-C malicious_config_dir -d malicious_module_dir"

The variable has two flags in it:

  1. -C: This sets the configuration directory for modprobe. We set its value to point to our malicious configuration folder.
  2. -d: This sets the root directory under which modprobe looks for modules. We set it to the directory containing our malicious module.
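The effect of the environment variable can be modelled in a few lines. This is a rough sketch under the assumption that the options from the environment are simply placed in front of the real command-line arguments; modprobe's own parser has its own quoting rules, and the directory names below are placeholders.

```python
import shlex


def effective_args(argv, environ):
    """Model: options taken from MODPROBE_OPTIONS come before the real arguments."""
    extra = shlex.split(environ.get("MODPROBE_OPTIONS", ""))
    return extra + list(argv)


# The privileged caller only ever asks for "fuse" ...
requested = ["fuse"]
# ... but the environment it inherited is controlled by the unprivileged user.
attacker_env = {"MODPROBE_OPTIONS": "-C /tmp/evil_conf -d /tmp/evil_root"}

print(effective_args(requested, attacker_env))
```

The caller never passes -C or -d itself, yet modprobe behaves as if it had, which is why reading options from the environment is unsafe in a setuid context.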

3. Modprobe configuration

In the malicious configuration file (in the directory given by -C in step 2), we set two configurations. We set fuse as an alias for rootmod (our malicious module's name) and pass the file descriptor of rootshell as an argument to our module (the reason for passing this file descriptor is explained later). Thus our config file will look like this, where the value on the right of the equals sign stands for the actual descriptor number:

alias fuse rootmod
options rootmod suidfile_fd=suidfile_fd

Now, when modprobe is called, it looks for the fuse module in the directory we set with -d in step 2. Since that directory contains only our malicious module (rootmod), modprobe cannot find fuse there. But the config file declares fuse as an alias for rootmod, so modprobe believes that fuse is the same as rootmod and loads rootmod.
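The alias trick can be illustrated with a toy resolver. It follows the description above rather than modprobe's real lookup code, whose internal order of checks may differ; the descriptor number 7 is a hypothetical value used only for this example.

```python
def parse_config(text):
    """Parse 'alias' and 'options' lines from a modprobe-style config."""
    aliases, options = {}, {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "alias":
            aliases[fields[1]] = fields[2]       # alias <name> <real module>
        elif len(fields) >= 3 and fields[0] == "options":
            options[fields[1]] = fields[2:]      # options <module> <params...>
    return aliases, options


def resolve(requested, available, aliases):
    """Return the module that would be loaded for 'requested', or None."""
    if requested in available:
        return requested
    target = aliases.get(requested)
    return target if target in available else None


CONFIG = """alias fuse rootmod
options rootmod suidfile_fd=7
"""

aliases, options = parse_config(CONFIG)
# The attacker's module directory contains only rootmod.
print(resolve("fuse", {"rootmod"}, aliases), options["rootmod"])
```

A request for fuse ends up as a load of rootmod, with the attacker's parameter attached.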

4. Deny access to /proc/filesystems

In Linux, the number of files that can be open at a time across the whole system is limited by the value in /proc/sys/fs/file-max; that is, the total number of open file descriptors cannot exceed this limit. We keep opening more and more files until the limit is reached. Once it is reached, any attempt by NTFS-3G to open /proc/filesystems will fail. (In the example, instead of explicitly opening files, we repeatedly create event objects, each of which returns a new file descriptor.)

Note: We open all the essential files required for the exploit beforehand so that we don't run out of file descriptors ourselves. This is the reason why we passed the file descriptor for rootshell as an argument in step 3.

Note: This has some practical difficulties, which are discussed later.
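The failure mode that this step relies on is easy to observe safely within a single process. The sketch below is an illustration only: it temporarily lowers the calling process's own descriptor limit (not the system-wide file-max) and opens /dev/null (not event objects) until open() fails. The function name and the limit of 64 are arbitrary choices for the demo.

```python
import errno
import os
import resource


def errno_when_descriptors_run_out(soft_limit=64):
    """Lower our own fd limit, open files until it fails, return the errno."""
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, hard)
    opened = []
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    try:
        while True:
            opened.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        return exc.errno
    finally:
        # Always release the descriptors and restore the original limit.
        for fd in opened:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))


print(errno.errorcode[errno_when_descriptors_run_out()])
```

Hitting the per-process limit produces EMFILE, while exhausting the system-wide table, as in the attack, produces ENFILE. In both cases the program calling open() just sees a failure, which is all the check on /proc/filesystems looks at.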

5. Call to NTFS-3G

Let's list what we have done till now:

  1. Created the malicious kernel module.
    Its __init will set the required permissions for our rootshell binary.
  2. Created the config file for modprobe.
    The configuration asks modprobe to load rootmod instead of fuse.
  3. Set the environment variable.
    This passes our config directory and module directory to modprobe when it is called.
  4. Exhausted all file descriptors in the system.
    This makes any attempt to read /proc/filesystems fail.

At this point, we call NTFS-3G and ask it to mount a dummy volume. NTFS-3G tries to read /proc/filesystems to determine whether the system supports the fuse module. Since no more file descriptors can be created, the read fails. In this case, NTFS-3G assumes the worst: assuming that fuse is not supported by the kernel, it calls modprobe to load the fuse module.

As mentioned in item 3 above, the environment variable now supplies our malicious options to modprobe, which misleads modprobe into using our configuration directory and module directory and loading our malicious module into the kernel. The module's __init then sets the permissions on rootshell, which we can run afterwards to get a root shell.

Some practical difficulties

There are some practical difficulties in exhausting the file descriptors.

In Linux, the number of files a single process can open is limited, and this limit is below the total number of file descriptors allowed system-wide. As a workaround, we have to fork the process that opens the files multiple times, so that the total number of files opened by all forked processes exceeds the system-wide limit.
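A rough estimate of how many processes are needed follows directly from the two limits. The helper names below are made up for this sketch, and the estimate ignores descriptors already held by other processes, so it is an upper bound rather than an exact count.

```python
import resource


def processes_needed(system_wide_max, per_process_limit):
    """Ceiling of system_wide_max / per_process_limit, using integer maths."""
    return -(-system_wide_max // per_process_limit)


def read_limits():
    """Read the system-wide limit and this process's soft limit (Linux only)."""
    with open("/proc/sys/fs/file-max") as f:
        system_wide_max = int(f.read())
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return system_wide_max, soft


# Example with made-up numbers: 100000 descriptors system-wide, 1024 per process.
print(processes_needed(100000, 1024))
```

On a real system, read_limits() returns the two values to plug in; the numbers vary widely between distributions and kernel versions.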

Before trying to mount anything, NTFS-3G checks whether the target is already mounted. For this, NTFS-3G reads /proc/mounts (this file lists all the filesystems that are currently mounted). If we exhaust the file descriptors before calling NTFS-3G, this read will fail, and NTFS-3G will not continue any further. So we need to exhaust all the file descriptors after NTFS-3G reads /proc/mounts and before it attempts to read /proc/filesystems. To achieve this, we create a notification on /proc/mounts, so that we are notified as soon as some process reads it. As soon as we are notified of this event, we send a SIGSTOP signal to the NTFS-3G process (note: SIGSTOP doesn't kill the process, it only pauses it). We then exhaust the file descriptors and send a SIGCONT signal to NTFS-3G (this signal asks the process to continue). When NTFS-3G continues, it is unable to read /proc/filesystems because we have exhausted the file descriptors.
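The pause-and-resume part of this trick can be tried on a harmless child process. The sketch below only demonstrates SIGSTOP and SIGCONT; the notification on /proc/mounts that triggers the pause in the exploit is left out, and the child is an ordinary sleeping Python process chosen for the demo.

```python
import os
import signal
import subprocess
import sys


def pause_and_resume(child_argv):
    """Start a child, pause it with SIGSTOP, resume it with SIGCONT.

    Returns True if the kernel reported the child as stopped.
    """
    child = subprocess.Popen(child_argv)
    try:
        os.kill(child.pid, signal.SIGSTOP)
        # WUNTRACED makes waitpid report stopped (not only exited) children.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        was_stopped = os.WIFSTOPPED(status)
        # While the child is frozen, the parent can do any amount of work.
        os.kill(child.pid, signal.SIGCONT)
    finally:
        child.kill()   # SIGKILL also works on a stopped process
        child.wait()
    return was_stopped


sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
print(pause_and_resume(sleeper))
```

The paused process cannot notice or prevent the stop, which is what gives the attacker a reliable window between the two reads.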

Proposed solution

Both modprobe and NTFS-3G can be blamed for the design flaw. On modprobe's side, accepting parameters from environment variables is a little insecure. But the authors of modprobe assume that the user or process setting the environment variables is also the one calling modprobe, and they reasonably expect callers to take precautions such as not running it in a setuid context. Since modprobe is not meant to be used the way NTFS-3G uses it, the fix belongs in NTFS-3G's design: NTFS-3G should clear the environment variable MODPROBE_OPTIONS before calling modprobe.
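The proposed fix amounts to sanitising the environment before a privileged program runs a helper. A minimal sketch of the idea is shown below; the function name is made up, and this is an illustration of the principle, not the actual NTFS-3G patch.

```python
def env_without_modprobe_options(environ):
    """Return a copy of the environment with MODPROBE_OPTIONS removed."""
    clean = dict(environ)
    clean.pop("MODPROBE_OPTIONS", None)
    return clean


inherited = {"PATH": "/usr/sbin:/usr/bin", "MODPROBE_OPTIONS": "-C /tmp/evil_conf"}
safe = env_without_modprobe_options(inherited)
print(sorted(safe))

# A privileged caller would then pass the cleaned environment explicitly, e.g.
#   subprocess.run(["/sbin/modprobe", "fuse"], env=safe)
```

A stricter variant of the same idea is to give the helper an empty or minimal environment instead of removing a single known variable, so that options added to the helper in the future cannot be abused either.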

