On Wed, Aug 07, 2013 at 08:04:13PM -0700, Sudeep Dutt wrote: > From: Caz Yokoyama <Caz.Yokoyama@xxxxxxxxx> > > This patch introduces a sample user space daemon which > implements the virtio device backends on the host. The daemon > creates/removes/configures virtio device backends by communicating with > the Intel MIC Host Driver. The virtio devices currently supported are > virtio net, virtio console and virtio block. Virtio net supports TSO/GSO. > The daemon also monitors card shutdown status and takes appropriate actions > like killing the virtio backends and resetting the card upon card shutdown > and crashes. > > Co-author: Ashutosh Dixit <ashutosh.dixit@xxxxxxxxx> > Co-author: Sudeep Dutt <sudeep.dutt@xxxxxxxxx> > Signed-off-by: Ashutosh Dixit <ashutosh.dixit@xxxxxxxxx> > Signed-off-by: Caz Yokoyama <Caz.Yokoyama@xxxxxxxxx> > Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@xxxxxxxxx> > Signed-off-by: Nikhil Rao <nikhil.rao@xxxxxxxxx> > Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@xxxxxxxxx> > Signed-off-by: Sudeep Dutt <sudeep.dutt@xxxxxxxxx> > Acked-by: Yaozu (Eddie) Dong <eddie.dong@xxxxxxxxx> > --- > Documentation/mic/mic_overview.txt | 48 + > Documentation/mic/mpssd/.gitignore | 1 + > Documentation/mic/mpssd/Makefile | 19 + > Documentation/mic/mpssd/micctrl | 152 ++++ > Documentation/mic/mpssd/mpss | 245 ++++++ > Documentation/mic/mpssd/mpssd.c | 1689 ++++++++++++++++++++++++++++++++++++ > Documentation/mic/mpssd/mpssd.h | 100 +++ > Documentation/mic/mpssd/sysfs.c | 103 +++ Is this generally useful or just example code? If the former, you can put it in tools/ as well. > 8 files changed, 2357 insertions(+) > create mode 100644 Documentation/mic/mic_overview.txt > create mode 100644 Documentation/mic/mpssd/.gitignore > create mode 100644 Documentation/mic/mpssd/Makefile > create mode 100755 Documentation/mic/mpssd/micctrl > create mode 100755 Documentation/mic/mpssd/mpss > create mode 100644 Documentation/mic/mpssd/mpssd.c > create mode 100644 Documentation/mic/mpssd/mpssd.h > create mode 100644 Documentation/mic/mpssd/sysfs.c > > diff --git a/Documentation/mic/mic_overview.txt b/Documentation/mic/mic_overview.txt > new file mode 100644 > index 0000000..8b1a916 > --- /dev/null > +++ b/Documentation/mic/mic_overview.txt > @@ -0,0 +1,48 @@ > +An Intel MIC X100 device is a PCIe form factor add-in coprocessor > +card based on the Intel Many Integrated Core (MIC) architecture > +that runs a Linux OS. It is a PCIe endpoint in a platform and therefore > +implements the three required standard address spaces i.e. configuration, > +memory and I/O. The host OS loads a device driver as is typical for > +PCIe devices. The card itself runs a bootstrap after reset that > +transfers control to the card OS downloaded from the host driver. > +The card OS as shipped by Intel is a Linux kernel with modifications > +for the X100 devices. > + > +Since it is a PCIe card, it does not have the ability to host hardware > +devices for networking, storage and console. We provide these devices > +on X100 coprocessors thus enabling a self-bootable equivalent environment > +for applications. A key benefit of our solution is that it leverages > +the standard virtio framework for network, disk and console devices, > +though in our case the virtio framework is used across a PCIe bus. > + > +Here is a block diagram of the various components described above. The > +virtio backends are situated on the host rather than the card given better > +single threaded performance for the host compared to MIC and the ability of > +the host to initiate DMA's to/from the card using the MIC DMA engine. > + > + | > + +----------+ | +----------+ > + | Card OS | | | Host OS | > + +----------+ | +----------+ > + | > ++-------+ +--------+ +------+ | +---------+ +--------+ +--------+ > +| Virtio| |Virtio | |Virtio| | |Virtio | |Virtio | |Virtio | > +| Net | |Console | |Block | | |Net | |Console | |Block | > +| Driver| |Driver | |Driver| | |backend | |backend | |backend | > ++-------+ +--------+ +------+ | +---------+ +--------+ +--------+ > + | | | | | | | > + | | | |Ring 3| | | > + | | | |------|------------|---------|------- > + +-------------------+ |Ring 0+--------------------------+ > + | | | Virtio over PCIe IOCTLs | > + | | +--------------------------+ > + +--------------+ | | > + |Intel MIC | | +---------------+ > + |Card Driver | | |Intel MIC | > + +--------------+ | |Host Driver | > + | | +---------------+ > + | | | > + +-------------------------------------------------------------+ > + | | > + | PCIe Bus | > + +-------------------------------------------------------------+ > diff --git a/Documentation/mic/mpssd/.gitignore b/Documentation/mic/mpssd/.gitignore > new file mode 100644 > index 0000000..8b7c72f > --- /dev/null > +++ b/Documentation/mic/mpssd/.gitignore > @@ -0,0 +1 @@ > +mpssd > diff --git a/Documentation/mic/mpssd/Makefile b/Documentation/mic/mpssd/Makefile > new file mode 100644 > index 0000000..eb860a7 > --- /dev/null > +++ b/Documentation/mic/mpssd/Makefile > @@ -0,0 +1,19 @@ > +# > +# Makefile - Intel MIC User Space Tools. > +# Copyright(c) 2013, Intel Corporation. > +# > +ifdef DEBUG > +CFLAGS += $(USERWARNFLAGS) -I. -g -Wall -DDEBUG=$(DEBUG) > +else > +CFLAGS += $(USERWARNFLAGS) -I. -g -Wall > +endif > + > +mpssd: mpssd.o sysfs.o > + $(CC) $(CFLAGS) -o $@ $^ -lpthread > + > +install: > + install mpssd /usr/sbin/mpssd > + install micctrl /usr/sbin/micctrl > + > +clean: > + rm -f mpssd *.o > diff --git a/Documentation/mic/mpssd/micctrl b/Documentation/mic/mpssd/micctrl > new file mode 100755 > index 0000000..e0cfa53 > --- /dev/null > +++ b/Documentation/mic/mpssd/micctrl > @@ -0,0 +1,152 @@ > +#!/bin/bash > +# Intel MIC Platform Software Stack (MPSS) > +# > +# Copyright(c) 2013 Intel Corporation. > +# > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License, version 2, as > +# published by the Free Software Foundation. > +# > +# This program is distributed in the hope that it will be useful, but > +# WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > +# General Public License for more details. > +# > +# The full GNU General Public License is included in this distribution in > +# the file called "COPYING". > +# > +# Intel MIC User Space Tools. > +# > +# micctrl - Controls MIC boot/start/stop. > +# > +# chkconfig: 2345 95 05 > +# description: start MPSS stack processing. > +# > +### BEGIN INIT INFO > +# Provides: micctrl > +### END INIT INFO > + > +# Source function library. > +. /etc/init.d/functions > + > +sysfs="/sys/class/mic" > + > +status() > +{ > + if [ "`echo $1 | head -c3`" == "mic" ]; then > + f=$sysfs/$1 > + echo -e $1 state: "`cat $f/state`" shutdown_status: "`cat $f/shutdown_status`" > + return 0 > + fi > + > + if [ -d "$sysfs" ]; then > + for f in $sysfs/* > + do > + echo -e ""`basename $f`" state: "`cat $f/state`" shutdown_status: "`cat $f/shutdown_status`"" > + done > + fi > + > + return 0 > +} > + > +reset() > +{ > + if [ "`echo $1 | head -c3`" == "mic" ]; then > + f=$sysfs/$1 > + echo reset > $f/state > + return 0 > + fi > + > + if [ -d "$sysfs" ]; then > + for f in $sysfs/* > + do > + echo reset > $f/state > + done > + fi > + > + return 0 > +} > + > +boot() > +{ > + if [ "`echo $1 | head -c3`" == "mic" ]; then > + f=$sysfs/$1 > + echo "boot:linux:mic/uos.img:mic/$1.image" > $f/state > + return 0 > + fi > + > + if [ -d "$sysfs" ]; then > + for f in $sysfs/* > + do > + echo "boot:linux:mic/uos.img:mic/`basename $f`.image" > $f/state > + done > + fi > + > + return 0 > +} > + > +shutdown() > +{ > + if [ "`echo $1 | head -c3`" == "mic" ]; then > + f=$sysfs/$1 > + echo shutdown > $f/state > + return 0 > + fi > + > + if [ -d "$sysfs" ]; then > + for f in $sysfs/* > + do > + echo shutdown > $f/state > + done > + fi > + > + return 0 > +} > + > +wait() > +{ > + if [ "`echo $1 | head -c3`" == "mic" ]; then > + f=$sysfs/$1 > + while [ "`cat $f/state`" != "offline" -a "`cat $f/state`" != "online" ] > + do > + sleep 1 > + echo -e "Waiting for $1 to go offline" > + done > + return 0 > + fi > + > + if [ -d "$sysfs" ]; then > + # Wait for the cards to go offline > + for f in $sysfs/* > + do > + while [ "`cat $f/state`" != "offline" -a "`cat $f/state`" != "online" ] > + do > + sleep 1 > + echo -e "Waiting for "`basename $f`" to go offline" > + done > + done > + fi > +} > + > +case $1 in > + -s) > + status $2 > + ;; > + -r) > + reset $2 > + ;; > + -b) > + boot $2 > + ;; > + -S) > + shutdown $2 > + ;; > + -w) > + wait $2 > + ;; > + *) > + echo $"Usage: $0 {-s (status) |-r (reset) |-b (boot) |-S (shutdown) |-w (wait)}" > + exit 2 > +esac > + > +exit $? > diff --git a/Documentation/mic/mpssd/mpss b/Documentation/mic/mpssd/mpss > new file mode 100755 > index 0000000..f0bb3dd > --- /dev/null > +++ b/Documentation/mic/mpssd/mpss > @@ -0,0 +1,245 @@ > +#!/bin/bash > +# Intel MIC Platform Software Stack (MPSS) > +# > +# Copyright(c) 2013 Intel Corporation. > +# > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License, version 2, as > +# published by the Free Software Foundation. > +# > +# This program is distributed in the hope that it will be useful, but > +# WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > +# General Public License for more details. > +# > +# The full GNU General Public License is included in this distribution in > +# the file called "COPYING". > +# > +# Intel MIC User Space Tools. > +# > +# mpss Start mpssd. > +# > +# chkconfig: 2345 95 05 > +# description: start MPSS stack processing. > +# > +### BEGIN INIT INFO > +# Provides: mpss > +# Required-Start: > +# Required-Stop: > +# Short-Description: MPSS stack control > +# Description: MPSS stack control > +### END INIT INFO > + > +# Source function library. > +. /etc/init.d/functions > + > +exec=/usr/sbin/mpssd > +sysfs="/sys/class/mic" > + > +start() > +{ > + [ -x $exec ] || exit 5 > + > + echo -e $"Starting MPSS Stack" > + > + echo -e $"Loading MIC_HOST Module" > + > + # Ensure the driver is loaded > + [ -d "$sysfs" ] || modprobe mic_host > + > + if [ "`ps -e | awk '{print $4}' | grep mpssd | head -1`" = "mpssd" ]; then > + echo -e $"MPSSD already running! " > + success > + echo > + return 0; > + fi > + > + # Start the daemon > + echo -n $"Starting MPSSD" > + $exec & > + RETVAL=$? > + if [ $RETVAL -ne 0 ]; then > + failure > + else > + success > + fi > + echo > + > + sleep 5 > + > + # Boot the cards > + if [ $RETVAL -eq 0 ]; then > + for f in $sysfs/* > + do > + echo -ne "Booting "`basename $f`" " > + echo "boot:linux:mic/uos.img:mic/`basename $f`.image" > $f/state > + RETVAL=$? > + if [ $RETVAL -ne 0 ]; then > + failure > + else > + success > + fi > + echo > + done > + fi > + > + # Wait till ping works > + if [ $RETVAL -eq 0 ]; then > + for f in $sysfs/* > + do > + count=100 > + ipaddr=`cat $f/cmdline` > + ipaddr=${ipaddr#*address,} > + ipaddr=`echo $ipaddr | cut -d, -f1 | cut -d\; -f1` > + > + while [ $count -ge 0 ] > + do > + echo -e "Pinging "`basename $f`" " > + ping -c 1 $ipaddr &> /dev/null > + RETVAL=$? > + if [ $RETVAL -eq 0 ]; then > + success > + break > + fi > + sleep 1 > + count=`expr $count - 1` > + done > + if [ $RETVAL -ne 0 ]; then > + failure > + else > + success > + fi > + echo > + done > + fi > + return $RETVAL > +} > + > +stop() > +{ > + echo -e $"Shutting down MPSS Stack: " > + > + # Bail out if module is unloaded > + if [ ! -d "$sysfs" ]; then > + echo -n $"Module unloaded " > + killall -9 mpssd 2>/dev/null > + success > + echo > + return 0 > + fi > + > + # Shut down the cards > + for f in $sysfs/* > + do > + echo -e "Shutting down `basename $f` " > + echo "shutdown" > $f/state 2>/dev/null > + done > + > + # Wait for the cards to go offline > + for f in $sysfs/* > + do > + while [ "`cat $f/state`" != "offline" ] > + do > + sleep 1 > + echo -e "Waiting for "`basename $f`" to go offline" > + done > + done > + > + # Display the status of the cards > + for f in $sysfs/* > + do > + echo -e ""`basename $f`" state: "`cat $f/state`"" > + done > + > + sleep 5 > + > + # Kill MPSSD now > + echo -n $"Killing MPSSD" > + killall -9 mpssd 2>/dev/null > + RETVAL=$? > + if [ $RETVAL -ne 0 ]; then > + failure > + else > + success > + fi > + echo > + return $RETVAL > +} > + > +restart() > +{ > + stop > + sleep 5 > + start > +} > + > +status() > +{ > + if [ -d "$sysfs" ]; then > + for f in $sysfs/* > + do > + echo -e ""`basename $f`" state: "`cat $f/state`"" > + done > + fi > + > + if [ "`ps -e | awk '{print $4}' | grep mpssd | head -n 1`" = "mpssd" ]; then > + echo "mpssd is running" > + else > + echo "mpssd is stopped" > + fi > + return 0 > +} > + > +unload() > +{ > + if [ ! -d "$sysfs" ]; then > + echo -n $"No MIC_HOST Module: " > + killall -9 mpssd 2>/dev/null > + success > + echo > + return > + fi > + > + stop > + RETVAL=$? > + > + sleep 5 > + echo -n $"Removing MIC_HOST Module: " > + > + if [ $RETVAL = 0 ]; then > + sleep 1 > + modprobe -r mic_host > + RETVAL=$? > + fi > + > + if [ $RETVAL -ne 0 ]; then > + failure > + else > + success > + fi > + echo > + return $RETVAL > +} > + > +case $1 in > + start) > + start > + ;; > + stop) > + stop > + ;; > + restart) > + restart > + ;; > + status) > + status > + ;; > + unload) > + unload > + ;; > + *) > + echo $"Usage: $0 {start|stop|restart|status|unload}" > + exit 2 > +esac > + > +exit $? > diff --git a/Documentation/mic/mpssd/mpssd.c b/Documentation/mic/mpssd/mpssd.c > new file mode 100644 > index 0000000..3bc34cb > --- /dev/null > +++ b/Documentation/mic/mpssd/mpssd.c > @@ -0,0 +1,1689 @@ > +/* > + * Intel MIC Platform Software Stack (MPSS) > + * > + * Copyright(c) 2013 Intel Corporation. > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License, version 2, as > + * published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * General Public License for more details. > + * > + * The full GNU General Public License is included in this distribution in > + * the file called "COPYING". > + * > + * Intel MIC User Space Tools. > + */ > + > +#define _GNU_SOURCE > + > +#include <stdlib.h> > +#include <fcntl.h> > +#include <getopt.h> > +#include <assert.h> > +#include <unistd.h> > +#include <stdbool.h> > +#include <signal.h> > +#include <poll.h> > +#include <features.h> > +#include <sys/types.h> > +#include <sys/stat.h> > +#include <sys/mman.h> > +#include <sys/socket.h> > +#include <linux/virtio_ring.h> > +#include <linux/virtio_net.h> > +#include <linux/virtio_console.h> > +#include <linux/virtio_blk.h> > +#include <linux/version.h> > +#include "mpssd.h" > +#include <linux/mic_ioctl.h> > +#include <linux/mic_common.h> > + > +static void init_mic(struct mic_info *mic); > + > +static FILE *logfp; > +static struct mic_info mic_list; > + > +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) > + > +#define min_t(type, x, y) ({ \ > + type __min1 = (x); \ > + type __min2 = (y); \ > + __min1 < __min2 ? __min1 : __min2; }) > + > +/* align addr on a size boundary - adjust address up/down if needed */ > +#define _ALIGN_UP(addr, size) (((addr)+((size)-1))&(~((size)-1))) > +#define _ALIGN_DOWN(addr, size) ((addr)&(~((size)-1))) > + > +/* align addr on a size boundary - adjust address up if needed */ > +#define _ALIGN(addr, size) _ALIGN_UP(addr, size) > + > +/* to align the pointer to the (next) page boundary */ > +#define PAGE_ALIGN(addr) _ALIGN(addr, PAGE_SIZE) > + > +#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x)) > + > +/* Insert REP NOP (PAUSE) in busy-wait loops. */ > +static inline void cpu_relax(void) > +{ > + asm volatile("rep; nop" : : : "memory"); > +} > + > +#define GSO_ENABLED 1 > +#define MAX_GSO_SIZE (64 * 1024) > +#define ETH_H_LEN 14 > +#define MAX_NET_PKT_SIZE (_ALIGN_UP(MAX_GSO_SIZE + ETH_H_LEN, 64)) > +#define MIC_DEVICE_PAGE_END 0x1000 > + > +#ifndef VIRTIO_NET_HDR_F_DATA_VALID > +#define VIRTIO_NET_HDR_F_DATA_VALID 2 /* Csum is valid */ > +#endif > + > +static struct { > + struct mic_device_desc dd; > + struct mic_vqconfig vqconfig[2]; > + __u32 host_features, guest_acknowledgements; > + struct virtio_console_config cons_config; > +} virtcons_dev_page = { > + .dd = { > + .type = VIRTIO_ID_CONSOLE, > + .num_vq = ARRAY_SIZE(virtcons_dev_page.vqconfig), > + .feature_len = sizeof(virtcons_dev_page.host_features), > + .config_len = sizeof(virtcons_dev_page.cons_config), > + }, > + .vqconfig[0] = { > + .num = htole16(MIC_VRING_ENTRIES), > + }, > + .vqconfig[1] = { > + .num = htole16(MIC_VRING_ENTRIES), > + }, > +}; > + > +static struct { > + struct mic_device_desc dd; > + struct mic_vqconfig vqconfig[2]; > + __u32 host_features, guest_acknowledgements; > + struct virtio_net_config net_config; > +} virtnet_dev_page = { > + .dd = { > + .type = VIRTIO_ID_NET, > + .num_vq = ARRAY_SIZE(virtnet_dev_page.vqconfig), > + .feature_len = sizeof(virtnet_dev_page.host_features), > + .config_len = sizeof(virtnet_dev_page.net_config), > + }, > + .vqconfig[0] = { > + .num = htole16(MIC_VRING_ENTRIES), > + }, > + .vqconfig[1] = { > + .num = htole16(MIC_VRING_ENTRIES), > + }, > +#if GSO_ENABLED > + .host_features = htole32( > + 1 << VIRTIO_NET_F_CSUM | > + 1 << VIRTIO_NET_F_GSO | > + 1 << VIRTIO_NET_F_GUEST_TSO4 | > + 1 << VIRTIO_NET_F_GUEST_TSO6 | > + 1 << VIRTIO_NET_F_GUEST_ECN | > + 1 << VIRTIO_NET_F_GUEST_UFO), > +#else > + .host_features = 0, > +#endif > +}; > + > +static const char *mic_config_dir = "/etc/sysconfig/mic"; > +static const char *virtblk_backend = "VIRTBLK_BACKEND"; > +static struct { > + struct mic_device_desc dd; > + struct mic_vqconfig vqconfig[1]; > + __u32 host_features, guest_acknowledgements; > + struct virtio_blk_config blk_config; > +} virtblk_dev_page = { > + .dd = { > + .type = VIRTIO_ID_BLOCK, > + .num_vq = ARRAY_SIZE(virtblk_dev_page.vqconfig), > + .feature_len = sizeof(virtblk_dev_page.host_features), > + .config_len = sizeof(virtblk_dev_page.blk_config), > + }, > + .vqconfig[0] = { > + .num = htole16(MIC_VRING_ENTRIES), > + }, > + .host_features = > + htole32(1<<VIRTIO_BLK_F_SEG_MAX), > + .blk_config = { > + .seg_max = htole32(MIC_VRING_ENTRIES - 2), > + .capacity = htole64(0), > + } > +}; > + > +static char *myname; > + > +static int > +tap_configure(struct mic_info *mic, char *dev) > +{ > + pid_t pid; > + char *ifargv[7]; > + char ipaddr[IFNAMSIZ]; > + int ret = 0; > + > + pid = fork(); > + if (pid == 0) { > + ifargv[0] = "ip"; > + ifargv[1] = "link"; > + ifargv[2] = "set"; > + ifargv[3] = dev; > + ifargv[4] = "up"; > + ifargv[5] = NULL; > + mpsslog("Configuring %s\n", dev); > + ret = execvp("ip", ifargv); > + if (ret < 0) { > + mpsslog("%s execvp failed errno %s\n", > + mic->name, strerror(errno)); > + return ret; > + } > + } > + if (pid < 0) { > + mpsslog("%s fork failed errno %s\n", > + mic->name, strerror(errno)); > + return ret; > + } > + > + ret = waitpid(pid, NULL, 0); > + if (ret < 0) { > + mpsslog("%s waitpid failed errno %s\n", > + mic->name, strerror(errno)); > + return ret; > + } > + > + snprintf(ipaddr, IFNAMSIZ, "172.31.%d.254/24", mic->id); > + > + pid = fork(); > + if (pid == 0) { > + ifargv[0] = "ip"; > + ifargv[1] = "addr"; > + ifargv[2] = "add"; > + ifargv[3] = ipaddr; > + ifargv[4] = "dev"; > + ifargv[5] = dev; > + ifargv[6] = NULL; > + mpsslog("Configuring %s ipaddr %s\n", dev, ipaddr); > + ret = execvp("ip", ifargv); > + if (ret < 0) { > + mpsslog("%s execvp failed errno %s\n", > + mic->name, strerror(errno)); > + return ret; > + } > + } > + if (pid < 0) { > + mpsslog("%s fork failed errno %s\n", > + mic->name, strerror(errno)); > + return ret; > + } > + > + ret = waitpid(pid, NULL, 0); > + if (ret < 0) { > + mpsslog("%s waitpid failed errno %s\n", > + mic->name, strerror(errno)); > + return ret; > + } > + mpsslog("MIC name %s %s %d DONE!\n", > + mic->name, __func__, __LINE__); > + return 0; > +} > + > +static int tun_alloc(struct mic_info *mic, char *dev) > +{ > + struct ifreq ifr; > + int fd, err; > +#if GSO_ENABLED > + unsigned offload; > +#endif > + fd = open("/dev/net/tun", O_RDWR); > + if (fd < 0) { > + mpsslog("Could not open /dev/net/tun %s\n", strerror(errno)); > + goto done; > + } > + > + memset(&ifr, 0, sizeof(ifr)); > + > + ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_VNET_HDR; > + if (*dev) > + strncpy(ifr.ifr_name, dev, IFNAMSIZ); > + > + err = ioctl(fd, TUNSETIFF, (void *) &ifr); > + if (err < 0) { > + mpsslog("%s %s %d TUNSETIFF failed %s\n", > + mic->name, __func__, __LINE__, strerror(errno)); > + close(fd); > + return err; > + } > +#if GSO_ENABLED > + offload = TUN_F_CSUM | TUN_F_TSO4 | TUN_F_TSO6 | > + TUN_F_TSO_ECN | TUN_F_UFO; > + > + err = ioctl(fd, TUNSETOFFLOAD, offload); > + if (err < 0) { > + mpsslog("%s %s %d TUNSETOFFLOAD failed %s\n", > + mic->name, __func__, __LINE__, strerror(errno)); > + close(fd); > + return err; > + } > +#endif > + strcpy(dev, ifr.ifr_name); > + mpsslog("Created TAP %s\n", dev); > +done: > + return fd; > +} > + > +#define NET_FD_VIRTIO_NET 0 > +#define NET_FD_TUN 1 > +#define MAX_NET_FD 2 > + > +static void * * > +get_dp(struct mic_info *mic, int type) > +{ > + switch (type) { > + case VIRTIO_ID_CONSOLE: > + return &mic->mic_console.console_dp; > + case VIRTIO_ID_NET: > + return &mic->mic_net.net_dp; > + case VIRTIO_ID_BLOCK: > + return &mic->mic_virtblk.block_dp; > + } > + mpsslog("%s %s %d not found\n", mic->name, __func__, type); > + assert(0); > + return NULL; > +} > + > +static struct mic_device_desc *get_device_desc(struct mic_info *mic, int type) > +{ > + struct mic_device_desc *d; > + int i; > + void *dp = *get_dp(mic, type); > + > + for (i = mic_aligned_size(struct mic_bootparam); i < PAGE_SIZE; > + i += mic_total_desc_size(d)) { > + d = dp + i; > + > + /* End of list */ > + if (d->type == 0) > + break; > + > + if (d->type == -1) > + continue; > + > + mpsslog("%s %s d-> type %d d %p\n", > + mic->name, __func__, d->type, d); > + > + if (d->type == (__u8)type) > + return d; > + } > + mpsslog("%s %s %d not found\n", mic->name, __func__, type); > + assert(0); > + return NULL; > +} > + > +/* See comments in vhost.c for explanation of next_desc() */ > +static unsigned next_desc(struct vring_desc *desc) > +{ > + unsigned int next; > + > + if (!(le16toh(desc->flags) & VRING_DESC_F_NEXT)) > + return -1U; > + next = le16toh(desc->next); > + return next; > +} > + > +/* Sum up all the IOVEC length */ > +static ssize_t > +sum_iovec_len(struct mic_copy_desc *copy) > +{ > + ssize_t sum = 0; > + int i; > + > + for (i = 0; i < copy->iovcnt; i++) > + sum += copy->iov[i].iov_len; > + return sum; > +} > + > +static inline void verify_out_len(struct mic_info *mic, > + struct mic_copy_desc *copy) > +{ > + if (copy->out_len != sum_iovec_len(copy)) { > + mpsslog("%s %s %d BUG copy->out_len 0x%x len 0x%x\n", > + mic->name, __func__, __LINE__, > + copy->out_len, sum_iovec_len(copy)); > + assert(copy->out_len == sum_iovec_len(copy)); > + } > +} > + > +/* Display an iovec */ > +static void > +disp_iovec(struct mic_info *mic, struct mic_copy_desc *copy, > + const char *s, int line) > +{ > + int i; > + > + for (i = 0; i < copy->iovcnt; i++) > + mpsslog("%s %s %d copy->iov[%d] addr %p len 0x%lx\n", > + mic->name, s, line, i, > + copy->iov[i].iov_base, copy->iov[i].iov_len); > +} > + > +static inline __u16 read_avail_idx(struct mic_vring *vr) > +{ > + return ACCESS_ONCE(vr->info->avail_idx); > +} > + > +static inline void txrx_prepare(int type, bool tx, struct mic_vring *vr, > + struct mic_copy_desc *copy, ssize_t len) > +{ > + copy->vr_idx = tx ? 0 : 1; > + copy->update_used = true; > + if (type == VIRTIO_ID_NET) > + copy->iov[1].iov_len = len - sizeof(struct virtio_net_hdr); > + else > + copy->iov[0].iov_len = len; > +} > + > +/* Central API which triggers the copies */ > +static int > +mic_virtio_copy(struct mic_info *mic, int fd, > + struct mic_vring *vr, struct mic_copy_desc *copy) > +{ > + int ret; > + > + ret = ioctl(fd, MIC_VIRTIO_COPY_DESC, copy); > + if (ret) { > + mpsslog("%s %s %d errno %s ret %d\n", > + mic->name, __func__, __LINE__, > + strerror(errno), ret); > + } > + return ret; > +} > + > +/* > + * This initialization routine requires at least one > + * vring i.e. vr0. vr1 is optional. > + */ > +static void * > +init_vr(struct mic_info *mic, int fd, int type, > + struct mic_vring *vr0, struct mic_vring *vr1, int num_vq) > +{ > + int vr_size; > + char *va; > + > + vr_size = PAGE_ALIGN(vring_size(MIC_VRING_ENTRIES, > + MIC_VIRTIO_RING_ALIGN) + sizeof(struct _mic_vring_info)); > + va = mmap(NULL, MIC_DEVICE_PAGE_END + vr_size * num_vq, > + PROT_READ, MAP_SHARED, fd, 0); > + if (MAP_FAILED == va) { > + mpsslog("%s %s %d mmap failed errno %s\n", > + mic->name, __func__, __LINE__, > + strerror(errno)); > + goto done; > + } > + *get_dp(mic, type) = (void *)va; > + vr0->va = (struct mic_vring *)&va[MIC_DEVICE_PAGE_END]; > + vr0->info = vr0->va + > + vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN); > + vring_init(&vr0->vr, > + MIC_VRING_ENTRIES, vr0->va, MIC_VIRTIO_RING_ALIGN); > + mpsslog("%s %s vr0 %p vr0->info %p vr_size 0x%x vring 0x%x ", > + __func__, mic->name, vr0->va, vr0->info, vr_size, > + vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN)); > + mpsslog("magic 0x%x expected 0x%x\n", > + vr0->info->magic, MIC_MAGIC + type + 0); > + assert(vr0->info->magic == MIC_MAGIC + type + 0); > + if (vr1) { > + vr1->va = (struct mic_vring *) > + &va[MIC_DEVICE_PAGE_END + vr_size]; > + vr1->info = vr1->va + vring_size(MIC_VRING_ENTRIES, > + MIC_VIRTIO_RING_ALIGN); > + vring_init(&vr1->vr, > + MIC_VRING_ENTRIES, vr1->va, MIC_VIRTIO_RING_ALIGN); > + mpsslog("%s %s vr1 %p vr1->info %p vr_size 0x%x vring 0x%x ", > + __func__, mic->name, vr1->va, vr1->info, vr_size, > + vring_size(MIC_VRING_ENTRIES, MIC_VIRTIO_RING_ALIGN)); > + mpsslog("magic 0x%x expected 0x%x\n", > + vr1->info->magic, MIC_MAGIC + type + 1); > + assert(vr1->info->magic == MIC_MAGIC + type + 1); > + } > +done: > + return va; > +} > + > +static void > +uninit_vr(struct mic_info *mic, int num_vq) > +{ > + int vr_size, ret; > + > + vr_size = PAGE_ALIGN(vring_size(MIC_VRING_ENTRIES, > + MIC_VIRTIO_RING_ALIGN) + sizeof(struct _mic_vring_info)); > + ret = munmap(mic->mic_virtblk.block_dp, > + MIC_DEVICE_PAGE_END + vr_size * num_vq); > + if (ret < 0) > + mpsslog("%s munmap errno %d\n", mic->name, errno); > +} > + > +static void > +wait_for_card_driver(struct mic_info *mic, int fd, int type) > +{ > + struct pollfd pollfd; > + int err; > + struct mic_device_desc *desc = get_device_desc(mic, type); > + > + pollfd.fd = fd; > + mpsslog("%s %s Waiting .... desc-> type %d status 0x%x\n", > + mic->name, __func__, type, desc->status); > + while (1) { > + pollfd.events = POLLIN; > + pollfd.revents = 0; > + err = poll(&pollfd, 1, -1); > + if (err < 0) { > + mpsslog("%s %s poll failed %s\n", > + mic->name, __func__, strerror(errno)); > + continue; > + } > + > + if (pollfd.revents) { > + mpsslog("%s %s Waiting... desc-> type %d status 0x%x\n", > + mic->name, __func__, type, desc->status); > + if (desc->status & VIRTIO_CONFIG_S_DRIVER_OK) { > + mpsslog("%s %s poll.revents %d\n", > + mic->name, __func__, pollfd.revents); > + mpsslog("%s %s desc-> type %d status 0x%x\n", > + mic->name, __func__, type, > + desc->status); > + break; > + } > + } > + } > +} > + > +/* Spin till we have some descriptors */ > +static void > +wait_for_descriptors(struct mic_info *mic, struct mic_vring *vr) > +{ > + __u16 avail_idx = read_avail_idx(vr); > + > + while (avail_idx == le16toh(ACCESS_ONCE(vr->vr.avail->idx))) { > +#ifdef DEBUG > + mpsslog("%s %s waiting for desc avail %d info_avail %d\n", > + mic->name, __func__, > + le16toh(vr->vr.avail->idx), vr->info->avail_idx); > +#endif > + cpu_relax(); > + } > +} > + > +static void * > +virtio_net(void *arg) > +{ > + static __u8 vnet_hdr[2][sizeof(struct virtio_net_hdr)]; > + static __u8 vnet_buf[2][MAX_NET_PKT_SIZE] __aligned(64); > + struct iovec vnet_iov[2][2] = { > + { { .iov_base = vnet_hdr[0], .iov_len = sizeof(vnet_hdr[0]) }, > + { .iov_base = vnet_buf[0], .iov_len = sizeof(vnet_buf[0]) } }, > + { { .iov_base = vnet_hdr[1], .iov_len = sizeof(vnet_hdr[1]) }, > + { .iov_base = vnet_buf[1], .iov_len = sizeof(vnet_buf[1]) } }, > + }; > + struct iovec *iov0 = vnet_iov[0], *iov1 = vnet_iov[1]; > + struct mic_info *mic = (struct mic_info *)arg; > + char if_name[IFNAMSIZ]; > + struct pollfd net_poll[MAX_NET_FD]; > + struct mic_vring tx_vr, rx_vr; > + struct mic_copy_desc copy; > + struct mic_device_desc *desc; > + int err; > + > + snprintf(if_name, IFNAMSIZ, "mic%d", mic->id); > + mic->mic_net.tap_fd = tun_alloc(mic, if_name); > + if (mic->mic_net.tap_fd < 0) > + goto done; > + > + if (tap_configure(mic, if_name)) > + goto done; > + mpsslog("MIC name %s id %d\n", mic->name, mic->id); > + > + net_poll[NET_FD_VIRTIO_NET].fd = mic->mic_net.virtio_net_fd; > + net_poll[NET_FD_VIRTIO_NET].events = POLLIN; > + net_poll[NET_FD_TUN].fd = mic->mic_net.tap_fd; > + net_poll[NET_FD_TUN].events = POLLIN; > + > + if (MAP_FAILED == init_vr(mic, mic->mic_net.virtio_net_fd, > + VIRTIO_ID_NET, &tx_vr, &rx_vr, > + virtnet_dev_page.dd.num_vq)) { > + mpsslog("%s init_vr failed %s\n", > + mic->name, strerror(errno)); > + goto done; > + } > + > + copy.iovcnt = 2; > + desc = get_device_desc(mic, VIRTIO_ID_NET); > + > + while (1) { > + ssize_t len; > + > + net_poll[NET_FD_VIRTIO_NET].revents = 0; > + net_poll[NET_FD_TUN].revents = 0; > + > + /* Start polling for data from tap and virtio net */ > + err = poll(net_poll, 2, -1); > + if (err < 0) { > + mpsslog("%s poll failed %s\n", > + __func__, strerror(errno)); > + continue; > + } > + if (!(desc->status & VIRTIO_CONFIG_S_DRIVER_OK)) > + wait_for_card_driver(mic, mic->mic_net.virtio_net_fd, > + VIRTIO_ID_NET); > + /* > + * Check if there is data to be read from TUN and write to > + * virtio net fd if there is. > + */ > + if (net_poll[NET_FD_TUN].revents & POLLIN) { > + copy.iov = iov0; > + len = readv(net_poll[NET_FD_TUN].fd, > + copy.iov, copy.iovcnt); > + if (len > 0) { > + struct virtio_net_hdr *hdr > + = (struct virtio_net_hdr *) vnet_hdr[0]; > + > + /* Disable checksums on the card since we are on > + a reliable PCIe link */ > + hdr->flags |= VIRTIO_NET_HDR_F_DATA_VALID; > +#ifdef DEBUG > + mpsslog("%s %s %d hdr->flags 0x%x ", mic->name, > + __func__, __LINE__, hdr->flags); > + mpsslog("copy.out_len %d hdr->gso_type 0x%x\n", > + copy.out_len, hdr->gso_type); > +#endif > +#ifdef DEBUG > + disp_iovec(mic, copy, __func__, __LINE__); > + mpsslog("%s %s %d read from tap 0x%lx\n", > + mic->name, __func__, __LINE__, > + len); > +#endif > + wait_for_descriptors(mic, &tx_vr); > + txrx_prepare(VIRTIO_ID_NET, 1, &tx_vr, ©, > + len); > + > + err = mic_virtio_copy(mic, > + mic->mic_net.virtio_net_fd, &tx_vr, > + ©); > + if (err < 0) { > + mpsslog("%s %s %d mic_virtio_copy %s\n", > + mic->name, __func__, __LINE__, > + strerror(errno)); > + } > + if (!err) > + verify_out_len(mic, ©); > +#ifdef DEBUG > + disp_iovec(mic, copy, __func__, __LINE__); > + mpsslog("%s %s %d wrote to net 0x%lx\n", > + mic->name, __func__, __LINE__, > + sum_iovec_len(©)); > +#endif > + /* Reinitialize IOV for next run */ > + iov0[1].iov_len = MAX_NET_PKT_SIZE; > + } else if (len < 0) { > + disp_iovec(mic, ©, __func__, __LINE__); > + mpsslog("%s %s %d read failed %s ", mic->name, > + __func__, __LINE__, strerror(errno)); > + mpsslog("cnt %d sum %d\n", > + copy.iovcnt, sum_iovec_len(©)); > + } > + } > + > + /* > + * Check if there is data to be read from virtio net and > + * write to TUN if there is. > + */ > + if (net_poll[NET_FD_VIRTIO_NET].revents & POLLIN) { > + while (rx_vr.info->avail_idx != > + le16toh(rx_vr.vr.avail->idx)) { > + copy.iov = iov1; > + txrx_prepare(VIRTIO_ID_NET, 0, &rx_vr, ©, > + MAX_NET_PKT_SIZE > + + sizeof(struct virtio_net_hdr)); > + > + err = mic_virtio_copy(mic, > + mic->mic_net.virtio_net_fd, &rx_vr, > + ©); > + if (!err) { > +#ifdef DEBUG > + struct virtio_net_hdr *hdr > + = (struct virtio_net_hdr *) > + vnet_hdr[1]; > + > + mpsslog("%s %s %d hdr->flags 0x%x, ", > + mic->name, __func__, __LINE__, > + hdr->flags); > + mpsslog("out_len %d gso_type 0x%x\n", > + copy.out_len, > + hdr->gso_type); > +#endif > + /* Set the correct output iov_len */ > + iov1[1].iov_len = copy.out_len - > + sizeof(struct virtio_net_hdr); > + verify_out_len(mic, ©); > +#ifdef DEBUG > + disp_iovec(mic, copy, __func__, > + __LINE__); > + mpsslog("%s %s %d ", > + mic->name, __func__, __LINE__); > + mpsslog("read from net 0x%lx\n", > + sum_iovec_len(copy)); > +#endif > + len = writev(net_poll[NET_FD_TUN].fd, > + copy.iov, copy.iovcnt); > + if (len != sum_iovec_len(©)) { > + mpsslog("Tun write failed %s ", > + strerror(errno)); > + mpsslog("len 0x%x ", len); > + mpsslog("read_len 0x%x\n", > + sum_iovec_len(©)); > + } else { > +#ifdef DEBUG > + disp_iovec(mic, ©, __func__, > + __LINE__); > + mpsslog("%s %s %d ", > + mic->name, __func__, > + __LINE__); > + mpsslog("wrote to tap 0x%lx\n", > + len); > +#endif > + } > + } else { > + mpsslog("%s %s %d mic_virtio_copy %s\n", > + mic->name, __func__, __LINE__, > + strerror(errno)); > + break; > + } > + } > + } > + if (net_poll[NET_FD_VIRTIO_NET].revents & POLLERR) { > + mpsslog("%s: %s: POLLERR\n", __func__, mic->name); > + sleep(1); > + } > + } > +done: > + pthread_exit(NULL); > +} > + > +/* virtio_console */ > +#define VIRTIO_CONSOLE_FD 0 > +#define MONITOR_FD (VIRTIO_CONSOLE_FD + 1) > +#define MAX_CONSOLE_FD (MONITOR_FD + 1) /* must be the last one + 1 */ > +#define MAX_BUFFER_SIZE PAGE_SIZE > + > +static void * > +virtio_console(void *arg) > +{ > + static __u8 vcons_buf[2][PAGE_SIZE]; > + struct iovec vcons_iov[2] = { > + { .iov_base = vcons_buf[0], .iov_len = sizeof(vcons_buf[0]) }, > + { .iov_base = vcons_buf[1], .iov_len = sizeof(vcons_buf[1]) }, > + }; > + struct iovec *iov0 = &vcons_iov[0], *iov1 = &vcons_iov[1]; > + struct mic_info *mic = (struct mic_info *)arg; > + int err; > + struct pollfd console_poll[MAX_CONSOLE_FD]; > + int pty_fd; > + char *pts_name; > + ssize_t len; > + struct mic_vring tx_vr, rx_vr; > + struct mic_copy_desc copy; > + struct mic_device_desc *desc; > + > + pty_fd = posix_openpt(O_RDWR); > + if (pty_fd < 0) { > + mpsslog("can't open a pseudoterminal master device: %s\n", > + strerror(errno)); > + goto _return; > + } > + pts_name = ptsname(pty_fd); > + if (pts_name == NULL) { > + mpsslog("can't get pts name\n"); > + goto _close_pty; > + } > + printf("%s console message goes to %s\n", mic->name, pts_name); > + mpsslog("%s console message goes to %s\n", mic->name, pts_name); > + err = grantpt(pty_fd); > + if (err < 0) { > + mpsslog("can't grant access: %s %s\n", > + pts_name, strerror(errno)); > + goto _close_pty; > + } > + err = unlockpt(pty_fd); > + if (err < 0) { > + mpsslog("can't unlock a pseudoterminal: %s %s\n", > + pts_name, strerror(errno)); > + goto _close_pty; > + } > + console_poll[MONITOR_FD].fd = pty_fd; > + console_poll[MONITOR_FD].events = POLLIN; > + > + console_poll[VIRTIO_CONSOLE_FD].fd = mic->mic_console.virtio_console_fd; > + console_poll[VIRTIO_CONSOLE_FD].events = POLLIN; > + > + if (MAP_FAILED == init_vr(mic, mic->mic_console.virtio_console_fd, > + VIRTIO_ID_CONSOLE, &tx_vr, &rx_vr, > + virtcons_dev_page.dd.num_vq)) { > + mpsslog("%s init_vr failed %s\n", > + mic->name, strerror(errno)); > + goto _close_pty; > + } > + > + copy.iovcnt = 1; > + desc = get_device_desc(mic, VIRTIO_ID_CONSOLE); > + > + for (;;) { > + console_poll[MONITOR_FD].revents = 0; > + console_poll[VIRTIO_CONSOLE_FD].revents = 0; > + err = poll(console_poll, MAX_CONSOLE_FD, -1); > + if (err < 0) { > + mpsslog("%s %d: poll failed: %s\n", __func__, __LINE__, > + strerror(errno)); > + continue; > + } > + if (!(desc->status & VIRTIO_CONFIG_S_DRIVER_OK)) > + wait_for_card_driver(mic, > + mic->mic_console.virtio_console_fd, > + VIRTIO_ID_CONSOLE); > + > + if (console_poll[MONITOR_FD].revents & POLLIN) { > + copy.iov = iov0; > + len = readv(pty_fd, copy.iov, copy.iovcnt); > + if (len > 0) { > +#ifdef DEBUG > + disp_iovec(mic, copy, __func__, __LINE__); > + mpsslog("%s %s %d read from tap 0x%lx\n", > + mic->name, __func__, __LINE__, > + len); > +#endif > + wait_for_descriptors(mic, &tx_vr); > + txrx_prepare(VIRTIO_ID_CONSOLE, 1, &tx_vr, > + ©, len); > + > + err = mic_virtio_copy(mic, > + mic->mic_console.virtio_console_fd, > + &tx_vr, ©); > + if (err < 0) { > + mpsslog("%s %s %d mic_virtio_copy %s\n", > + mic->name, __func__, __LINE__, > + strerror(errno)); > + } > + if (!err) > + verify_out_len(mic, ©); > +#ifdef DEBUG > + disp_iovec(mic, copy, __func__, __LINE__); > + mpsslog("%s %s %d wrote to net 0x%lx\n", > + mic->name, __func__, __LINE__, > + sum_iovec_len(copy)); > +#endif > + /* Reinitialize IOV for next run */ > + iov0->iov_len = PAGE_SIZE; > + } else if (len < 0) { > + disp_iovec(mic, ©, __func__, __LINE__); > + mpsslog("%s %s %d read failed %s ", > + mic->name, __func__, __LINE__, > + strerror(errno)); > + mpsslog("cnt %d sum %d\n", > + copy.iovcnt, sum_iovec_len(©)); > + } > + } > + > + if (console_poll[VIRTIO_CONSOLE_FD].revents & POLLIN) { > + while (rx_vr.info->avail_idx != > + le16toh(rx_vr.vr.avail->idx)) { > + copy.iov = iov1; > + txrx_prepare(VIRTIO_ID_CONSOLE, 0, &rx_vr, > + ©, PAGE_SIZE); > + > + err = mic_virtio_copy(mic, > + mic->mic_console.virtio_console_fd, > + &rx_vr, ©); > + if (!err) { > + /* Set the correct output iov_len */ > + iov1->iov_len = copy.out_len; > + verify_out_len(mic, ©); > +#ifdef DEBUG > + disp_iovec(mic, copy, __func__, > + __LINE__); > + mpsslog("%s %s %d ", > + mic->name, __func__, __LINE__); > + mpsslog("read from net 0x%lx\n", > + sum_iovec_len(copy)); > +#endif > + len = writev(pty_fd, > + copy.iov, copy.iovcnt); > + if (len != sum_iovec_len(©)) { > + mpsslog("Tun write failed %s ", > + strerror(errno)); > + mpsslog("len 0x%x ", len); > + mpsslog("read_len 0x%x\n", > + sum_iovec_len(©)); > + } else { > +#ifdef DEBUG > + disp_iovec(mic, copy, __func__, > + __LINE__); > + mpsslog("%s %s %d ", > + mic->name, __func__, > + __LINE__); > + mpsslog("wrote to tap 0x%lx\n", > + len); > +#endif > + } > + } else { > + mpsslog("%s %s %d mic_virtio_copy %s\n", > + mic->name, __func__, __LINE__, > + strerror(errno)); > + break; > + } > + } > + } > + if (console_poll[NET_FD_VIRTIO_NET].revents & POLLERR) { > + mpsslog("%s: %s: POLLERR\n", __func__, mic->name); > + sleep(1); > + } > + } > +_close_pty: > + close(pty_fd); > +_return: > + pthread_exit(NULL); > +} > + > +static void > +add_virtio_device(struct mic_info *mic, struct mic_device_desc *dd) > +{ > + char path[PATH_MAX]; > + int fd, err; > + > + snprintf(path, PATH_MAX, "/dev/mic%d", mic->id); > + fd = open(path, O_RDWR); > + if (fd < 0) { > + mpsslog("Could not open %s %s\n", path, strerror(errno)); > + return; > + } > + > + err = ioctl(fd, MIC_VIRTIO_ADD_DEVICE, dd); > + if (err < 0) { > + mpsslog("Could not add %d %s\n", dd->type, strerror(errno)); > + close(fd); > + return; > + } > + switch (dd->type) { > + case VIRTIO_ID_NET: > + mic->mic_net.virtio_net_fd = fd; > + mpsslog("Added VIRTIO_ID_NET for %s\n", mic->name); > + break; > + case VIRTIO_ID_CONSOLE: > + mic->mic_console.virtio_console_fd = fd; > + mpsslog("Added VIRTIO_ID_CONSOLE for %s\n", mic->name); > + break; > + case VIRTIO_ID_BLOCK: > + mic->mic_virtblk.virtio_block_fd = fd; > + mpsslog("Added VIRTIO_ID_BLOCK for %s\n", mic->name); > + break; > + } > +} > + > +static bool > +set_backend_file(struct mic_info *mic) > +{ > + FILE *config; > + char buff[PATH_MAX], *line, *evv, *p; > + > + snprintf(buff, PATH_MAX, "%s/mpssd%03d.conf", mic_config_dir, mic->id); > + config = fopen(buff, "r"); > + if (config == NULL) > + return false; > + do { /* look for "virtblk_backend=XXXX" */ > + line = fgets(buff, PATH_MAX, config); > + if (line == NULL) > + break; > + if (*line == '#') > + continue; > + p = strchr(line, '\n'); > + if (p) > + *p = '\0'; > + } while (strncmp(line, virtblk_backend, strlen(virtblk_backend)) != 0); > + fclose(config); > + if (line == NULL) > + return false; > + evv = strchr(line, '='); > + if (evv == NULL) > + return false; > + mic->mic_virtblk.backend_file = malloc(strlen(evv)); > + if (mic->mic_virtblk.backend_file == NULL) { > + mpsslog("can't allocate memory\n", mic->name, mic->id); > + return false; > + } > + strcpy(mic->mic_virtblk.backend_file, evv + 1); > + return true; > +} > + > +#define SECTOR_SIZE 512 > +static bool > +set_backend_size(struct mic_info *mic) > +{ > + mic->mic_virtblk.backend_size = lseek(mic->mic_virtblk.backend, 0, > + SEEK_END); > + if (mic->mic_virtblk.backend_size < 0) { > + mpsslog("%s: can't seek: %s\n", > + mic->name, mic->mic_virtblk.backend_file); > + return false; > + } > + virtblk_dev_page.blk_config.capacity = > + mic->mic_virtblk.backend_size / SECTOR_SIZE; > + if ((mic->mic_virtblk.backend_size % SECTOR_SIZE) != 0) > + virtblk_dev_page.blk_config.capacity++; > + > + virtblk_dev_page.blk_config.capacity = > + htole64(virtblk_dev_page.blk_config.capacity); > + > + return true; > +} > + > +static bool > +open_backend(struct mic_info *mic) > +{ > + if (!set_backend_file(mic)) > + goto _error_exit; > + mic->mic_virtblk.backend = open(mic->mic_virtblk.backend_file, O_RDWR); > + if (mic->mic_virtblk.backend < 0) { > + mpsslog("%s: can't open: %s\n", mic->name, > + mic->mic_virtblk.backend_file); > + goto _error_free; > + } > + if (!set_backend_size(mic)) > + goto _error_close; > + mic->mic_virtblk.backend_addr = mmap(NULL, > + mic->mic_virtblk.backend_size, > + PROT_READ|PROT_WRITE, MAP_SHARED, > + mic->mic_virtblk.backend, 0L); > + if (mic->mic_virtblk.backend_addr == MAP_FAILED) { > + mpsslog("%s: can't map: %s %s\n", > + mic->name, mic->mic_virtblk.backend_file, > + strerror(errno)); > + goto _error_close; > + } > + return true; > + > + _error_close: > + close(mic->mic_virtblk.backend); > + _error_free: > + free(mic->mic_virtblk.backend_file); > + _error_exit: > + return false; > +} > + > +static void > +close_backend(struct mic_info *mic) > +{ > + munmap(mic->mic_virtblk.backend_addr, mic->mic_virtblk.backend_size); > + close(mic->mic_virtblk.backend); > + free(mic->mic_virtblk.backend_file); > +} > + > +static bool > +start_virtblk(struct mic_info *mic, struct mic_vring *vring) > +{ > + if (((__u64)&virtblk_dev_page.blk_config % 8) != 0) { > + mpsslog("%s: blk_config is not 8 byte aligned.\n", > + mic->name); > + return false; > + } > + add_virtio_device(mic, &virtblk_dev_page.dd); > + if (MAP_FAILED == init_vr(mic, mic->mic_virtblk.virtio_block_fd, > + VIRTIO_ID_BLOCK, vring, NULL, virtblk_dev_page.dd.num_vq)) { > + mpsslog("%s init_vr failed %s\n", > + mic->name, strerror(errno)); > + return false; > + } > + return true; > +} > + > +static void > +stop_virtblk(struct mic_info *mic) > +{ > + uninit_vr(mic, virtblk_dev_page.dd.num_vq); > + close(mic->mic_virtblk.virtio_block_fd); > +} > + > +static __u8 > +header_error_check(struct vring_desc *desc) > +{ > + if (le32toh(desc->len) != sizeof(struct virtio_blk_outhdr)) { > + mpsslog("%s() %d: length is not sizeof(virtio_blk_outhd)\n", > + __func__, __LINE__); > + return -EIO; > + } > + if (!(le16toh(desc->flags) & VRING_DESC_F_NEXT)) { > + mpsslog("%s() %d: alone\n", > + __func__, __LINE__); > + return -EIO; > + } > + if (le16toh(desc->flags) & VRING_DESC_F_WRITE) { > + mpsslog("%s() %d: not read\n", > + __func__, __LINE__); > + return -EIO; > + } > + return 0; > +} > + > +static int > +read_header(int fd, struct virtio_blk_outhdr *hdr, __u32 desc_idx) > +{ > + struct iovec iovec; > + struct mic_copy_desc copy; > + > + iovec.iov_len = sizeof(*hdr); > + iovec.iov_base = hdr; > + copy.iov = &iovec; > + copy.iovcnt = 1; > + copy.vr_idx = 0; /* only one vring on virtio_block */ > + copy.update_used = false; /* do not update used index */ > + return ioctl(fd, MIC_VIRTIO_COPY_DESC, ©); > +} > + > +static int > +transfer_blocks(int fd, struct iovec *iovec, __u32 iovcnt) > +{ > + struct mic_copy_desc copy; > + > + copy.iov = iovec; > + copy.iovcnt = iovcnt; > + copy.vr_idx = 0; /* only one vring on virtio_block */ > + copy.update_used = false; /* do not update used index */ > + return ioctl(fd, MIC_VIRTIO_COPY_DESC, ©); > +} > + > +static __u8 > +status_error_check(struct vring_desc *desc) > +{ > + if (le32toh(desc->len) != sizeof(__u8)) { > + mpsslog("%s() %d: length is not sizeof(status)\n", > + __func__, __LINE__); > + return -EIO; > + } > + return 0; > +} > + > +static int > +write_status(int fd, __u8 *status) > +{ > + struct iovec iovec; > + struct mic_copy_desc copy; > + > + iovec.iov_base = status; > + iovec.iov_len = sizeof(*status); > + copy.iov = &iovec; > + copy.iovcnt = 1; > + copy.vr_idx = 0; /* only one vring on virtio_block */ > + copy.update_used = true; /* Update used index */ > + return ioctl(fd, MIC_VIRTIO_COPY_DESC, ©); > +} > + > +static void * > +virtio_block(void *arg) > +{ > + struct mic_info *mic = (struct mic_info *) arg; > + int ret; > + struct pollfd block_poll; > + struct mic_vring vring; > + __u16 avail_idx; > + __u32 desc_idx; > + struct vring_desc *desc; > + struct iovec *iovec, *piov; > + __u8 status; > + __u32 buffer_desc_idx; > + struct virtio_blk_outhdr hdr; > + void *fos; > + > + for (;;) { /* forever */ > + if (!open_backend(mic)) { /* No virtblk */ > + for (mic->mic_virtblk.signaled = 0; > + !mic->mic_virtblk.signaled;) > + sleep(1); > + continue; > + } > + > + /* backend file is specified. */ > + if (!start_virtblk(mic, &vring)) > + goto _close_backend; > + iovec = malloc(sizeof(*iovec) * > + le32toh(virtblk_dev_page.blk_config.seg_max)); > + if (!iovec) { > + mpsslog("%s: can't alloc iovec: %s\n", > + mic->name, strerror(ENOMEM)); > + goto _stop_virtblk; > + } > + > + block_poll.fd = mic->mic_virtblk.virtio_block_fd; > + block_poll.events = POLLIN; > + for (mic->mic_virtblk.signaled = 0; > + !mic->mic_virtblk.signaled;) { > + block_poll.revents = 0; > + /* timeout in 1 sec to see signaled */ > + ret = poll(&block_poll, 1, 1000); > + if (ret < 0) { > + mpsslog("%s %d: poll failed: %s\n", > + __func__, __LINE__, > + strerror(errno)); > + continue; > + } > + > + if (!(block_poll.revents & POLLIN)) { > +#ifdef DEBUG > + mpsslog("%s %d: block_poll.revents=0x%x\n", > + __func__, __LINE__, block_poll.revents); > + sleep(1); > +#endif > + continue; > + } > + > + /* POLLIN */ > + while (vring.info->avail_idx != > + le16toh(vring.vr.avail->idx)) { > + /* read header element */ > + avail_idx = > + vring.info->avail_idx & > + (vring.vr.num - 1); > + desc_idx = le16toh( > + vring.vr.avail->ring[avail_idx]); > + desc = &vring.vr.desc[desc_idx]; > +#ifdef DEBUG > + mpsslog("%s() %d: avail_idx=%d ", > + __func__, __LINE__, > + vring.info->avail_idx); > + mpsslog("vring.vr.num=%d desc=%p\n", > + vring.vr.num, desc); > +#endif > + status = header_error_check(desc); > + ret = read_header( > + mic->mic_virtblk.virtio_block_fd, > + &hdr, desc_idx); > + if (ret < 0) { > + mpsslog("%s() %d %s: ret=%d %s\n", > + __func__, __LINE__, > + mic->name, ret, > + strerror(errno)); > + break; > + } > + /* buffer element */ > + piov = iovec; > + status = 0; > + fos = mic->mic_virtblk.backend_addr + > + (hdr.sector * SECTOR_SIZE); > + buffer_desc_idx = desc_idx = > + next_desc(desc); > + for (desc = &vring.vr.desc[buffer_desc_idx]; > + desc->flags & VRING_DESC_F_NEXT; > + desc_idx = next_desc(desc), > + desc = &vring.vr.desc[desc_idx]) { > + piov->iov_len = desc->len; > + piov->iov_base = fos; > + piov++; > + fos += desc->len; > + } > + /* Returning NULLs for VIRTIO_BLK_T_GET_ID. */ > + if (hdr.type & ~(VIRTIO_BLK_T_OUT | > + VIRTIO_BLK_T_GET_ID)) { > + /* > + VIRTIO_BLK_T_IN - does not do > + anything. Probably for documenting. > + VIRTIO_BLK_T_SCSI_CMD - for > + virtio_scsi. > + VIRTIO_BLK_T_FLUSH - turned off in > + config space. > + VIRTIO_BLK_T_BARRIER - defined but not > + used in anywhere. > + */ > + mpsslog("%s() %d: type %x ", > + __func__, __LINE__, > + hdr.type); > + mpsslog("is not supported\n"); > + status = -ENOTSUP; > + > + } else { > + ret = transfer_blocks( > + mic->mic_virtblk.virtio_block_fd, > + iovec, > + piov - iovec); > + if (ret < 0 && > + status != 0) > + status = ret; > + } > + /* write status and update used pointer */ > + if (status != 0) > + status = status_error_check(desc); > + ret = write_status( > + mic->mic_virtblk.virtio_block_fd, > + &status); > +#ifdef DEBUG > + mpsslog("%s() %d: write status=%d on desc=%p\n", > + __func__, __LINE__, > + status, desc); > +#endif > + } > + } > + free(iovec); > +_stop_virtblk: > + stop_virtblk(mic); > +_close_backend: > + close_backend(mic); > + } /* forever */ > + > + pthread_exit(NULL); > +} > + > +static void > +reset(struct mic_info *mic) > +{ > +#define RESET_TIMEOUT 120 > + int i = RESET_TIMEOUT; > + setsysfs(mic->name, "state", "reset"); > + while (i) { > + char *state; > + state = readsysfs(mic->name, "state"); > + if (!state) > + goto retry; > + mpsslog("%s: %s %d state %s\n", > + mic->name, __func__, __LINE__, state); > + if ((!strcmp(state, "offline"))) { > + free(state); > + break; > + } > + free(state); > +retry: > + sleep(1); > + i--; > + } > +} > + > +static int > +get_mic_shutdown_status(struct mic_info *mic, char *shutdown_status) > +{ > + if (!strcmp(shutdown_status, "nop")) > + return MIC_NOP; > + if (!strcmp(shutdown_status, "crashed")) > + return MIC_CRASHED; > + if (!strcmp(shutdown_status, "halted")) > + return MIC_HALTED; > + if (!strcmp(shutdown_status, "poweroff")) > + return MIC_POWER_OFF; > + if (!strcmp(shutdown_status, "restart")) > + return MIC_RESTART; > + mpsslog("%s: BUG invalid status %s\n", mic->name, shutdown_status); > + /* Invalid state */ > + assert(0); > +}; > + > +static int get_mic_state(struct mic_info *mic, char *state) > +{ > + if (!strcmp(state, "offline")) > + return MIC_OFFLINE; > + if (!strcmp(state, "online")) > + return MIC_ONLINE; > + if (!strcmp(state, "shutting_down")) > + return MIC_SHUTTING_DOWN; > + if (!strcmp(state, "reset_failed")) > + return MIC_RESET_FAILED; > + mpsslog("%s: BUG invalid state %s\n", mic->name, state); > + /* Invalid state */ > + assert(0); > +}; > + > +static void mic_handle_shutdown(struct mic_info *mic) > +{ > +#define SHUTDOWN_TIMEOUT 60 > + int i = SHUTDOWN_TIMEOUT, ret, stat = 0; > + char *shutdown_status; > + while (i) { > + shutdown_status = readsysfs(mic->name, "shutdown_status"); > + if (!shutdown_status) > + continue; > + mpsslog("%s: %s %d shutdown_status %s\n", > + mic->name, __func__, __LINE__, shutdown_status); > + switch (get_mic_shutdown_status(mic, shutdown_status)) { > + case MIC_RESTART: > + mic->restart = 1; > + case MIC_HALTED: > + case MIC_POWER_OFF: > + case MIC_CRASHED: > + goto reset; > + default: > + break; > + } > + free(shutdown_status); > + sleep(1); > + i--; > + } > +reset: > + ret = kill(mic->pid, SIGTERM); > + mpsslog("%s: %s %d kill pid %d ret %d\n", > + mic->name, __func__, __LINE__, > + mic->pid, ret); > + if (!ret) { > + ret = waitpid(mic->pid, &stat, > + WIFSIGNALED(stat)); > + mpsslog("%s: %s %d waitpid ret %d pid %d\n", > + mic->name, __func__, __LINE__, > + ret, mic->pid); > + } > + if (ret == mic->pid) > + reset(mic); > +} > + > +static void * > +mic_config(void *arg) > +{ > + struct mic_info *mic = (struct mic_info *)arg; > + char *state = NULL; > + char pathname[PATH_MAX]; > + int fd, ret; > + struct pollfd ufds[1]; > + char value[4096]; > + > + snprintf(pathname, PATH_MAX - 1, "%s/%s/%s", > + MICSYSFSDIR, mic->name, "state"); > + > + fd = open(pathname, O_RDONLY); > + if (fd < 0) { > + mpsslog("%s: opening file %s failed %s\n", > + mic->name, pathname, strerror(errno)); > + goto error; > + } > + > + do { > + ret = read(fd, value, sizeof(value)); > + if (ret < 0) { > + mpsslog("%s: Failed to read sysfs entry '%s': %s\n", > + mic->name, pathname, strerror(errno)); > + goto close_error1; > + } > +retry: > + state = readsysfs(mic->name, "state"); > + if (!state) > + goto retry; > + mpsslog("%s: %s %d state %s\n", > + mic->name, __func__, __LINE__, state); > + switch (get_mic_state(mic, state)) { > + case MIC_SHUTTING_DOWN: > + mic_handle_shutdown(mic); > + goto close_error; > + default: > + break; > + } > + free(state); > + > + ufds[0].fd = fd; > + ufds[0].events = POLLERR | POLLPRI; > + ret = poll(ufds, 1, -1); > + if (ret < 0) { > + mpsslog("%s: poll failed %s\n", > + mic->name, strerror(errno)); > + goto close_error1; > + } > + } while (1); > +close_error: > + free(state); > +close_error1: > + close(fd); > +error: > + init_mic(mic); > + pthread_exit(NULL); > +} > + > +static void > +set_cmdline(struct mic_info *mic) > +{ > + char buffer[PATH_MAX]; > + int len; > + > + len = snprintf(buffer, PATH_MAX, > + "clocksource=tsc highres=off nohz=off "); > + len += snprintf(buffer + len, PATH_MAX, > + "cpufreq_on;corec6_off;pc3_off;pc6_off "); > + len += snprintf(buffer + len, PATH_MAX, > + "ifcfg=static;address,172.31.%d.1;netmask,255.255.255.0", > + mic->id); > + > + setsysfs(mic->name, "cmdline", buffer); > + mpsslog("%s: Command line: \"%s\"\n", mic->name, buffer); > + snprintf(buffer, PATH_MAX, "172.31.%d.1", mic->id); > + mpsslog("%s: IPADDR: \"%s\"\n", mic->name, buffer); > +} > + > +static void > +set_log_buf_info(struct mic_info *mic) > +{ > + int fd; > + off_t len; > + char system_map[] = "/lib/firmware/mic/System.map"; > + char *map, *temp, log_buf[17] = {'\0'}; > + > + fd = open(system_map, O_RDONLY); > + if (fd < 0) { > + mpsslog("%s: Opening System.map failed: %d\n", > + mic->name, errno); > + return; > + } > + len = lseek(fd, 0, SEEK_END); > + if (len < 0) { > + mpsslog("%s: Reading System.map size failed: %d\n", > + mic->name, errno); > + close(fd); > + return; > + } > + map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0); > + if (map == MAP_FAILED) { > + mpsslog("%s: mmap of System.map failed: %d\n", > + mic->name, errno); > + close(fd); > + return; > + } > + temp = strstr(map, "__log_buf"); > + if (!temp) { > + mpsslog("%s: __log_buf not found: %d\n", mic->name, errno); > + munmap(map, len); > + close(fd); > + return; > + } > + strncpy(log_buf, temp - 19, 16); > + setsysfs(mic->name, "log_buf_addr", log_buf); > + mpsslog("%s: log_buf_addr: %s\n", mic->name, log_buf); > + temp = strstr(map, "log_buf_len"); > + if (!temp) { > + mpsslog("%s: log_buf_len not found: %d\n", mic->name, errno); > + munmap(map, len); > + close(fd); > + return; > + } > + strncpy(log_buf, temp - 19, 16); > + setsysfs(mic->name, "log_buf_len", log_buf); > + mpsslog("%s: log_buf_len: %s\n", mic->name, log_buf); > + munmap(map, len); > + close(fd); > +} > + > +static void init_mic(struct mic_info *mic); > + > +static void > +change_virtblk_backend(int x, siginfo_t *siginfo, void *p) > +{ > + struct mic_info *mic; > + > + for (mic = mic_list.next; mic != NULL; mic = mic->next) > + mic->mic_virtblk.signaled = 1/* true */; > +} > + > +static void > +init_mic(struct mic_info *mic) > +{ > + struct sigaction ignore = { > + .sa_flags = 0, > + .sa_handler = SIG_IGN > + }; > + struct sigaction act = { > + .sa_flags = SA_SIGINFO, > + .sa_sigaction = change_virtblk_backend, > + }; > + char buffer[PATH_MAX]; > + int err; > + > + /* ignore SIGUSR1 for both process */ > + sigaction(SIGUSR1, &ignore, NULL); > + > + mic->pid = fork(); > + switch (mic->pid) { > + case 0: > + set_log_buf_info(mic); > + set_cmdline(mic); > + add_virtio_device(mic, &virtcons_dev_page.dd); > + add_virtio_device(mic, &virtnet_dev_page.dd); > + err = pthread_create(&mic->mic_console.console_thread, NULL, > + virtio_console, mic); > + if (err) > + mpsslog("%s virtcons pthread_create failed %s\n", > + mic->name, strerror(err)); > + /* > + * TODO: Debug why not adding this sleep results in the tap > + * interface not coming up during certain runs sporadically. > + */ Indeed. > + usleep(1000); > + err = pthread_create(&mic->mic_net.net_thread, NULL, > + virtio_net, mic); > + if (err) > + mpsslog("%s virtnet pthread_create failed %s\n", > + mic->name, strerror(err)); > + err = pthread_create(&mic->mic_virtblk.block_thread, NULL, > + virtio_block, mic); > + if (err) > + mpsslog("%s virtblk pthread_create failed %s\n", > + mic->name, strerror(err)); > + sigemptyset(&act.sa_mask); > + err = sigaction(SIGUSR1, &act, NULL); Confused. Who sends this SIGUSR1 here? > + if (err) > + mpsslog("%s sigaction SIGUSR1 failed %s\n", > + mic->name, strerror(errno)); > + while (1) > + sleep(60); > + case -1: > + mpsslog("fork failed MIC name %s id %d errno %d\n", > + mic->name, mic->id, errno); > + break; > + default: > + if (mic->restart) { > + snprintf(buffer, PATH_MAX, > + "boot:linux:mic/uos.img:mic/mic%d.image", > + mic->id); > + setsysfs(mic->name, "state", buffer); > + mpsslog("%s restarting mic %d\n", > + mic->name, mic->restart); > + mic->restart = 0; > + } > + pthread_create(&mic->config_thread, NULL, mic_config, mic); > + } > +} > + > +static void > +start_daemon(void) > +{ > + struct mic_info *mic; > + > + for (mic = mic_list.next; mic != NULL; mic = mic->next) > + init_mic(mic); > + > + while (1) > + sleep(60); > +} > + > +static int > +init_mic_list(void) > +{ > + struct mic_info *mic = &mic_list; > + struct dirent *file; > + DIR *dp; > + int cnt = 0; > + > + dp = opendir(MICSYSFSDIR); > + if (!dp) > + return 0; > + > + while ((file = readdir(dp)) != NULL) { > + if (!strncmp(file->d_name, "mic", 3)) { > + mic->next = malloc(sizeof(struct mic_info)); > + if (mic->next) { > + mic = mic->next; > + mic->next = NULL; > + memset(mic, 0, sizeof(struct mic_info)); > + mic->id = atoi(&file->d_name[3]); > + mic->name = malloc(strlen(file->d_name) + 16); > + if (mic->name) > + strcpy(mic->name, file->d_name); > + mpsslog("MIC name %s id %d\n", mic->name, > + mic->id); > + cnt++; > + } > + } > + } > + > + closedir(dp); > + return cnt; > +} > + > +void > +mpsslog(char *format, ...) > +{ > + va_list args; > + char buffer[4096]; > + time_t t; > + char *ts; > + > + if (logfp == NULL) > + return; > + > + va_start(args, format); > + vsprintf(buffer, format, args); > + va_end(args); > + > + time(&t); > + ts = ctime(&t); > + ts[strlen(ts) - 1] = '\0'; > + fprintf(logfp, "%s: %s", ts, buffer); > + > + fflush(logfp); > +} > + > +int > +main(int argc, char *argv[]) > +{ > + int cnt; > + > + myname = argv[0]; > + > + logfp = fopen(LOGFILE_NAME, "a+"); > + if (!logfp) { > + fprintf(stderr, "cannot open logfile '%s'\n", LOGFILE_NAME); > + exit(1); > + } > + > + mpsslog("MIC Daemon start\n"); > + > + cnt = init_mic_list(); > + if (cnt == 0) { > + mpsslog("MIC module not loaded\n"); > + exit(2); > + } > + mpsslog("MIC found %d devices\n", cnt); > + > + start_daemon(); > + > + exit(0); > +} > diff --git a/Documentation/mic/mpssd/mpssd.h b/Documentation/mic/mpssd/mpssd.h > new file mode 100644 > index 0000000..b6dee38 > --- /dev/null > +++ b/Documentation/mic/mpssd/mpssd.h > @@ -0,0 +1,100 @@ > +/* > + * Intel MIC Platform Software Stack (MPSS) > + * > + * Copyright(c) 2013 Intel Corporation. > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License, version 2, as > + * published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * General Public License for more details. > + * > + * The full GNU General Public License is included in this distribution in > + * the file called "COPYING". > + * > + * Intel MIC User Space Tools. > + */ > +#ifndef _MPSSD_H_ > +#define _MPSSD_H_ > + > +#include <stdio.h> > +#include <stdlib.h> > +#include <string.h> > +#include <fcntl.h> > +#include <unistd.h> > +#include <dirent.h> > +#include <libgen.h> > +#include <pthread.h> > +#include <stdarg.h> > +#include <time.h> > +#include <errno.h> > +#include <sys/dir.h> > +#include <sys/ioctl.h> > +#include <sys/poll.h> > +#include <sys/types.h> > +#include <sys/socket.h> > +#include <sys/stat.h> > +#include <sys/types.h> > +#include <sys/mman.h> > +#include <sys/utsname.h> > +#include <sys/wait.h> > +#include <netinet/in.h> > +#include <arpa/inet.h> > +#include <netdb.h> > +#include <pthread.h> > +#include <signal.h> > +#include <limits.h> > +#include <syslog.h> > +#include <getopt.h> > +#include <net/if.h> > +#include <linux/if_tun.h> > +#include <linux/if_tun.h> > +#include <linux/virtio_ids.h> > + > +#define MICSYSFSDIR "/sys/class/mic" > +#define LOGFILE_NAME "/var/log/mpssd" > +#define PAGE_SIZE 4096 > + > +struct mic_console_info { > + pthread_t console_thread; > + int virtio_console_fd; > + void *console_dp; > +}; > + > +struct mic_net_info { > + pthread_t net_thread; > + int virtio_net_fd; > + int tap_fd; > + void *net_dp; > +}; > + > +struct mic_virtblk_info { > + pthread_t block_thread; > + int virtio_block_fd; > + void *block_dp; > + volatile sig_atomic_t signaled; > + char *backend_file; > + int backend; > + void *backend_addr; > + long backend_size; > +}; > + > +struct mic_info { > + int id; > + char *name; > + pthread_t config_thread; > + pid_t pid; > + struct mic_console_info mic_console; > + struct mic_net_info mic_net; > + struct mic_virtblk_info mic_virtblk; > + int restart; > + struct mic_info *next; > +}; > + > +void mpsslog(char *format, ...); > +char *readsysfs(char *dir, char *entry); > +int setsysfs(char *dir, char *entry, char *value); > +#endif > diff --git a/Documentation/mic/mpssd/sysfs.c b/Documentation/mic/mpssd/sysfs.c > new file mode 100644 > index 0000000..3244dcf > --- /dev/null > +++ b/Documentation/mic/mpssd/sysfs.c > @@ -0,0 +1,103 @@ > +/* > + * Intel MIC Platform Software Stack (MPSS) > + * > + * Copyright(c) 2013 Intel Corporation. > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License, version 2, as > + * published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + * General Public License for more details. > + * > + * The full GNU General Public License is included in this distribution in > + * the file called "COPYING". > + * > + * Intel MIC User Space Tools. > + */ > + > +#include "mpssd.h" > + > +#define PAGE_SIZE 4096 > + > +char * > +readsysfs(char *dir, char *entry) > +{ > + char filename[PATH_MAX]; > + char value[PAGE_SIZE]; > + char *string = NULL; > + int fd; > + int len; > + > + if (dir == NULL) > + snprintf(filename, PATH_MAX, "%s/%s", MICSYSFSDIR, entry); > + else > + snprintf(filename, PATH_MAX, > + "%s/%s/%s", MICSYSFSDIR, dir, entry); > + > + fd = open(filename, O_RDONLY); > + if (fd < 0) { > + mpsslog("Failed to open sysfs entry '%s': %s\n", > + filename, strerror(errno)); > + return NULL; > + } > + > + len = read(fd, value, sizeof(value)); > + if (len < 0) { > + mpsslog("Failed to read sysfs entry '%s': %s\n", > + filename, strerror(errno)); > + goto readsys_ret; > + } > + > + value[len] = '\0'; Why are you careful to put this \0 here but not in setsysfs below? If you do, I'd fail on len == sizeof value as well, it isn't going to work with that. > + > + string = malloc(strlen(value) + 1); > + if (string) > + strcpy(string, value); > + > +readsys_ret: > + close(fd); > + return string; > +} > + > +int > +setsysfs(char *dir, char *entry, char *value) > +{ > + char filename[PATH_MAX]; > + char oldvalue[PAGE_SIZE]; > + int fd; > + > + if (dir == NULL) > + snprintf(filename, PATH_MAX, "%s/%s", MICSYSFSDIR, entry); > + else > + snprintf(filename, PATH_MAX, "%s/%s/%s", > + MICSYSFSDIR, dir, entry); > + > + fd = open(filename, O_RDWR); > + if (fd < 0) { > + mpsslog("Failed to open sysfs entry '%s': %s\n", > + filename, strerror(errno)); > + return errno; > + } > + > + if (read(fd, oldvalue, sizeof(oldvalue)) < 0) { > + mpsslog("Failed to read sysfs entry '%s': %s\n", > + filename, strerror(errno)); > + close(fd); > + return errno; > + } > + > + if (strcmp(value, oldvalue)) { > + if (write(fd, value, strlen(value)) < 0) { > + mpsslog("Failed to write new sysfs entry '%s': %s\n", > + filename, strerror(errno)); > + close(fd); > + return errno; > + } > + } > + > + close(fd); > + return 0; > +} > -- > 1.8.2.1 _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization