On Mon, Apr 30, 2018 at 11:28:44AM -0700, Greg KH wrote: > On Mon, Apr 30, 2018 at 05:54:08PM +0000, Sasha Levin wrote: > > On Mon, Apr 30, 2018 at 08:06:34AM -0700, Greg KH wrote: > > >On Mon, Apr 30, 2018 at 02:39:29PM +0000, Sasha Levin wrote: > > >> On Mon, Apr 30, 2018 at 06:02:19AM -0700, Greg KH wrote: > > >> >On Fri, Apr 27, 2018 at 02:00:53AM +0000, Sasha Levin wrote: > > >> >> Hi Greg, > > >> >> > > >> >> Pleae pull commits for Linux 4.14 . > > >> >> > > >> >> I've sent a review request for all commits over a week ago and all > > >> >> comments were addressed. > > >> >> > > >> >> > > >> >> Thanks, > > >> >> Sasha > > >> >> > > >> >> ===== > > >> >> > > >> >> > > >> >> The following changes since commit d6949f48093c2d862d9bc39a7a89f2825c55edc4: > > >> >> > > >> >> Linux 4.14.36 (2018-04-24 09:36:40 +0200) > > >> >> > > >> >> are available in the Git repository at: > > >> >> > > >> >> git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git tags/for-greg-4.14-26042018 > > >> > > > >> >Did you really regenerate this tree? It has a load of patches that are > > >> >already upstream. I'm getting a bunch of conflicts when first trying to > > >> >merge, and then rebase it. > > >> > > >> Yes, it was based on 4.14.36 which was released two days before I sent > > >> this pull request. > > >> > > >> >Can you verify this is the correct tag? > > >> > > >> The conflicts seem to happen because you're venturing into the > > >> non-stable-tagged land. Are you now looking at non-stable-tagged patches > > >> as well? > > >> > > >> See for example: > > >> > > >> commit 43de32cdf0f4f73519e2df12fb93adc24f9746cb > > >> Author: Andreas Kemnade <andreas@xxxxxxxxxxxx> > > >> Date: Tue Feb 20 07:30:10 2018 -0600 > > >> > > >> usb: musb: fix enumeration after resume > > >> > > >> If that's the case, we'll need to find a new way to sync up, otherwise > > >> this will keep happening. > > > > > >No, something went wrong here, as the other tags you sent me all look > > >fine. > > > > > >For example, let's look at the first commit in this tree: > > > > > >--------- > > >Date: Fri, 3 Nov 2017 20:28:57 +0900 > > >From: Hector Martin <marcan@xxxxxxxxx> > > >Subject: [PATCH 001/755] firewire-ohci: work around oversized DMA reads on JMicron controllers > > > > > >[ Upstream commit 188775181bc05f29372b305ef96485840e351fde ] > > >--------- > > > > > >It shows up here in this branch as commit > > >62080d9352e274275ccb090af366d339bbad08a7, yet, if you look in my tree, > > >it is really commit 4a5d70332d57bd473ffe76d8777a2e9f847c7863 > > > > > >So now I have duplicates in here, which is what I thought I said was > > >happening last time I pulled :( > > > > > >The other 4 requests were all fine, can you redo this one please? > > > > Oh I see, it looks like more commits from my branch were merged after > > the last time we had this talk, and I ended up with different commit IDs > > because I rebased the whole thing on a newer stable tag. > > > > I've pushed a fixed tag as: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git tags/for-greg-4.14-30042018 > > Much nicer, thanks for that. > > Let me go review these now... Here's the mbox of the patches I have dropped from this pull request. Most of them are identical to the ones dropped from 4.16 and some were dropped based on being asked from the developers when I committed them. A few others were dropped because of my review. thanks, greg k-h
>From 7236a65b187f09de789e1beec2fac47125e3342a Mon Sep 17 00:00:00 2001 From: Michael Ellerman <mpe@xxxxxxxxxxxxxx> Date: Thu, 8 Mar 2018 13:54:40 +1100 Subject: [PATCH 439/592] powerpc/pseries: Make plpar_wrappers.h safe to include when PSERIES=n Content-Length: 1087 Lines: 36 [ Upstream commit 5017e875e497c00dbc17558161fec3ff30b2b4a9 ] Currently plpar_wrappers.h is not safe to include when CONFIG_PPC_PSERIES=n, or at least it can be depending on other config options and so on. Fix that by wrapping the entire content in an ifdef. Signed-off-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- arch/powerpc/include/asm/plpar_wrappers.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h index 55eddf50d149..540785d01f96 100644 --- a/arch/powerpc/include/asm/plpar_wrappers.h +++ b/arch/powerpc/include/asm/plpar_wrappers.h @@ -2,6 +2,8 @@ #ifndef _ASM_POWERPC_PLPAR_WRAPPERS_H #define _ASM_POWERPC_PLPAR_WRAPPERS_H +#ifdef CONFIG_PPC_PSERIES + #include <linux/string.h> #include <linux/irqflags.h> @@ -340,4 +342,6 @@ static inline long plpar_get_cpu_characteristics(struct h_cpu_char_result *p) return rc; } +#endif /* CONFIG_PPC_PSERIES */ + #endif /* _ASM_POWERPC_PLPAR_WRAPPERS_H */ -- 2.17.0 >From 0b35cedc32e9cd050926f31016577362786d8e4a Mon Sep 17 00:00:00 2001 From: Davidlohr Bueso <dave@xxxxxxxxxxxx> Date: Tue, 10 Apr 2018 16:35:26 -0700 Subject: [PATCH 321/592] ipc/sem: introduce semctl(SEM_STAT_ANY) Content-Length: 4689 Lines: 134 [ Upstream commit a280d6dc77eb6002f269d58cd47c7c7e69b617b6 ] There is a permission discrepancy when consulting shm ipc object metadata between /proc/sysvipc/sem (0444) and the SEM_STAT semctl command. The later does permission checks for the object vs S_IRUGO. As such there can be cases where EACCESS is returned via syscall but the info is displayed anyways in the procfs files. While this might have security implications via info leaking (albeit no writing to the sma metadata), this behavior goes way back and showing all the objects regardless of the permissions was most likely an overlook - so we are stuck with it. Furthermore, modifying either the syscall or the procfs file can cause userspace programs to break (ie ipcs). Some applications require getting the procfs info (without root privileges) and can be rather slow in comparison with a syscall -- up to 500x in some reported cases for shm. This patch introduces a new SEM_STAT_ANY command such that the sem ipc object permissions are ignored, and only audited instead. In addition, I've left the lsm security hook checks in place, as if some policy can block the call, then the user has no other choice than just parsing the procfs file. Link: http://lkml.kernel.org/r/20180215162458.10059-3-dave@xxxxxxxxxxxx Signed-off-by: Davidlohr Bueso <dbueso@xxxxxxx> Reported-by: Robert Kettler <robert.kettler@xxxxxxxxxxx> Cc: Eric W. Biederman <ebiederm@xxxxxxxxxxxx> Cc: Kees Cook <keescook@xxxxxxxxxxxx> Cc: Manfred Spraul <manfred@xxxxxxxxxxxxxxxx> Cc: Michael Kerrisk <mtk.manpages@xxxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- include/uapi/linux/sem.h | 1 + ipc/sem.c | 17 ++++++++++++----- security/selinux/hooks.c | 1 + security/smack/smack_lsm.c | 1 + 4 files changed, 15 insertions(+), 5 deletions(-) diff --git a/include/uapi/linux/sem.h b/include/uapi/linux/sem.h index 9c3e745b0656..39a1876f039e 100644 --- a/include/uapi/linux/sem.h +++ b/include/uapi/linux/sem.h @@ -19,6 +19,7 @@ /* ipcs ctl cmds */ #define SEM_STAT 18 #define SEM_INFO 19 +#define SEM_STAT_ANY 20 /* Obsolete, used only for backwards compatibility and libc5 compiles */ struct semid_ds { diff --git a/ipc/sem.c b/ipc/sem.c index b2698ebdcb31..046dc63075f5 100644 --- a/ipc/sem.c +++ b/ipc/sem.c @@ -1189,14 +1189,14 @@ static int semctl_stat(struct ipc_namespace *ns, int semid, memset(semid64, 0, sizeof(*semid64)); rcu_read_lock(); - if (cmd == SEM_STAT) { + if (cmd == SEM_STAT || cmd == SEM_STAT_ANY) { sma = sem_obtain_object(ns, semid); if (IS_ERR(sma)) { err = PTR_ERR(sma); goto out_unlock; } id = sma->sem_perm.id; - } else { + } else { /* IPC_STAT */ sma = sem_obtain_object_check(ns, semid); if (IS_ERR(sma)) { err = PTR_ERR(sma); @@ -1204,9 +1204,14 @@ static int semctl_stat(struct ipc_namespace *ns, int semid, } } - err = -EACCES; - if (ipcperms(ns, &sma->sem_perm, S_IRUGO)) - goto out_unlock; + /* see comment for SHM_STAT_ANY */ + if (cmd == SEM_STAT_ANY) + audit_ipc_obj(&sma->sem_perm); + else { + err = -EACCES; + if (ipcperms(ns, &sma->sem_perm, S_IRUGO)) + goto out_unlock; + } err = security_sem_semctl(sma, cmd); if (err) @@ -1585,6 +1590,7 @@ SYSCALL_DEFINE4(semctl, int, semid, int, semnum, int, cmd, unsigned long, arg) return semctl_info(ns, semid, cmd, p); case IPC_STAT: case SEM_STAT: + case SEM_STAT_ANY: err = semctl_stat(ns, semid, cmd, &semid64); if (err < 0) return err; @@ -1686,6 +1692,7 @@ COMPAT_SYSCALL_DEFINE4(semctl, int, semid, int, semnum, int, cmd, int, arg) return semctl_info(ns, semid, cmd, p); case IPC_STAT: case SEM_STAT: + case SEM_STAT_ANY: err = semctl_stat(ns, semid, cmd, &semid64); if (err < 0) return err; diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index f5d304736852..d61fd83d6dcd 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -5837,6 +5837,7 @@ static int selinux_sem_semctl(struct sem_array *sma, int cmd) break; case IPC_STAT: case SEM_STAT: + case SEM_STAT_ANY: perms = SEM__GETATTR | SEM__ASSOCIATE; break; default: diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c index 286171a16ed2..07f705ecf3bd 100644 --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -3162,6 +3162,7 @@ static int smack_sem_semctl(struct sem_array *sma, int cmd) case GETALL: case IPC_STAT: case SEM_STAT: + case SEM_STAT_ANY: may = MAY_READ; break; case SETVAL: -- 2.17.0 >From f847fbdf44eab60563231b28bde2e325f6e91053 Mon Sep 17 00:00:00 2001 From: Gregory CLEMENT <gregory.clement@xxxxxxxxxxx> Date: Tue, 27 Feb 2018 18:04:25 +0100 Subject: [PATCH 225/592] mailmap: Update email address for Gregory CLEMENT Content-Length: 767 Lines: 24 [ Upstream commit c535d632aecc6359d072374675a7787cbe71773b ] As now Free Electrons is Bootlin. Signed-off-by: Gregory CLEMENT <gregory.clement@xxxxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- .mailmap | 1 + 1 file changed, 1 insertion(+) diff --git a/.mailmap b/.mailmap index c021f29779a7..9da7695c5557 100644 --- a/.mailmap +++ b/.mailmap @@ -62,6 +62,7 @@ Frank Zago <fzago@xxxxxxxxxxxxxxxxxxxxx> Greg Kroah-Hartman <greg@echidna.(none)> Greg Kroah-Hartman <gregkh@xxxxxxx> Greg Kroah-Hartman <greg@xxxxxxxxx> +Gregory CLEMENT <gregory.clement@xxxxxxxxxxx> <gregory.clement@xxxxxxxxxxxxxxxxxx> Henk Vergonet <Henk.Vergonet@xxxxxxxxx> Henrik Kretzschmar <henne@xxxxxxxxxxxxxxxx> Henrik Rydberg <rydberg@xxxxxxxxxxx> -- 2.17.0 >From 70f18d4540b2b702e602d8394a77e87f6f88f766 Mon Sep 17 00:00:00 2001 From: Sinan Kaya <okaya@xxxxxxxxxxxxxx> Date: Sun, 25 Mar 2018 10:39:19 -0400 Subject: [PATCH 388/592] net: qlge: Eliminate duplicate barriers on weakly-ordered archs Content-Length: 2481 Lines: 62 [ Upstream commit e42d8cee343a545ac2d9557a3b28708bbca2bd31 ] Code includes wmb() followed by writel(). writel() already has a barrier on some architectures like arm64. This ends up CPU observing two barriers back to back before executing the register write. Create a new wrapper function with relaxed write operator. Use the new wrapper when a write is following a wmb(). Signed-off-by: Sinan Kaya <okaya@xxxxxxxxxxxxxx> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- drivers/net/ethernet/qlogic/qlge/qlge.h | 16 ++++++++++++++++ drivers/net/ethernet/qlogic/qlge/qlge_main.c | 3 ++- 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/qlogic/qlge/qlge.h b/drivers/net/ethernet/qlogic/qlge/qlge.h index 84ac50f92c9c..3e71b65a9546 100644 --- a/drivers/net/ethernet/qlogic/qlge/qlge.h +++ b/drivers/net/ethernet/qlogic/qlge/qlge.h @@ -2184,6 +2184,22 @@ static inline void ql_write_db_reg(u32 val, void __iomem *addr) mmiowb(); } +/* + * Doorbell Registers: + * Doorbell registers are virtual registers in the PCI memory space. + * The space is allocated by the chip during PCI initialization. The + * device driver finds the doorbell address in BAR 3 in PCI config space. + * The registers are used to control outbound and inbound queues. For + * example, the producer index for an outbound queue. Each queue uses + * 1 4k chunk of memory. The lower half of the space is for outbound + * queues. The upper half is for inbound queues. + * Caller has to guarantee ordering. + */ +static inline void ql_write_db_reg_relaxed(u32 val, void __iomem *addr) +{ + writel_relaxed(val, addr); +} + /* * Shadow Registers: * Outbound queues have a consumer index that is maintained by the chip. diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_main.c b/drivers/net/ethernet/qlogic/qlge/qlge_main.c index 9feec7009443..a153bcb8c739 100644 --- a/drivers/net/ethernet/qlogic/qlge/qlge_main.c +++ b/drivers/net/ethernet/qlogic/qlge/qlge_main.c @@ -2702,7 +2702,8 @@ static netdev_tx_t qlge_send(struct sk_buff *skb, struct net_device *ndev) tx_ring->prod_idx = 0; wmb(); - ql_write_db_reg(tx_ring->prod_idx, tx_ring->prod_idx_db_reg); + ql_write_db_reg_relaxed(tx_ring->prod_idx, tx_ring->prod_idx_db_reg); + mmiowb(); netif_printk(qdev, tx_queued, KERN_DEBUG, qdev->ndev, "tx queued, slot %d, len %d\n", tx_ring->prod_idx, skb->len); -- 2.17.0 >From b03eb4ad6e8aba9e2c35aa8f4625897c5134925a Mon Sep 17 00:00:00 2001 From: David Ahern <dsahern@xxxxxxxxx> Date: Tue, 13 Feb 2018 08:44:06 -0800 Subject: [PATCH 577/592] selftests: Add FIB onlink tests Status: RO Content-Length: 10200 Lines: 396 [ Upstream commit 153e1b84f477f716bc3f81e6cfae1a3d941fc7ec ] Add test cases verifying FIB onlink commands work as expected in various conditions - IPv4, IPv6, main table, and VRF. Signed-off-by: David Ahern <dsahern@xxxxxxxxx> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- .../testing/selftests/net/fib-onlink-tests.sh | 375 ++++++++++++++++++ 1 file changed, 375 insertions(+) create mode 100755 tools/testing/selftests/net/fib-onlink-tests.sh diff --git a/tools/testing/selftests/net/fib-onlink-tests.sh b/tools/testing/selftests/net/fib-onlink-tests.sh new file mode 100755 index 000000000000..06b1d7cc12cc --- /dev/null +++ b/tools/testing/selftests/net/fib-onlink-tests.sh @@ -0,0 +1,375 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +# IPv4 and IPv6 onlink tests + +PAUSE_ON_FAIL=${PAUSE_ON_FAIL:=no} + +# Network interfaces +# - odd in current namespace; even in peer ns +declare -A NETIFS +# default VRF +NETIFS[p1]=veth1 +NETIFS[p2]=veth2 +NETIFS[p3]=veth3 +NETIFS[p4]=veth4 +# VRF +NETIFS[p5]=veth5 +NETIFS[p6]=veth6 +NETIFS[p7]=veth7 +NETIFS[p8]=veth8 + +# /24 network +declare -A V4ADDRS +V4ADDRS[p1]=169.254.1.1 +V4ADDRS[p2]=169.254.1.2 +V4ADDRS[p3]=169.254.3.1 +V4ADDRS[p4]=169.254.3.2 +V4ADDRS[p5]=169.254.5.1 +V4ADDRS[p6]=169.254.5.2 +V4ADDRS[p7]=169.254.7.1 +V4ADDRS[p8]=169.254.7.2 + +# /64 network +declare -A V6ADDRS +V6ADDRS[p1]=2001:db8:101::1 +V6ADDRS[p2]=2001:db8:101::2 +V6ADDRS[p3]=2001:db8:301::1 +V6ADDRS[p4]=2001:db8:301::2 +V6ADDRS[p5]=2001:db8:501::1 +V6ADDRS[p6]=2001:db8:501::2 +V6ADDRS[p7]=2001:db8:701::1 +V6ADDRS[p8]=2001:db8:701::2 + +# Test networks: +# [1] = default table +# [2] = VRF +# +# /32 host routes +declare -A TEST_NET4 +TEST_NET4[1]=169.254.101 +TEST_NET4[2]=169.254.102 +# /128 host routes +declare -A TEST_NET6 +TEST_NET6[1]=2001:db8:101 +TEST_NET6[2]=2001:db8:102 + +# connected gateway +CONGW[1]=169.254.1.254 +CONGW[2]=169.254.5.254 + +# recursive gateway +RECGW4[1]=169.254.11.254 +RECGW4[2]=169.254.12.254 +RECGW6[1]=2001:db8:11::64 +RECGW6[2]=2001:db8:12::64 + +# for v4 mapped to v6 +declare -A TEST_NET4IN6IN6 +TEST_NET4IN6[1]=10.1.1.254 +TEST_NET4IN6[2]=10.2.1.254 + +# mcast address +MCAST6=ff02::1 + + +PEER_NS=bart +PEER_CMD="ip netns exec ${PEER_NS}" +VRF=lisa +VRF_TABLE=1101 +PBR_TABLE=101 + +################################################################################ +# utilities + +log_test() +{ + local rc=$1 + local expected=$2 + local msg="$3" + + if [ ${rc} -eq ${expected} ]; then + nsuccess=$((nsuccess+1)) + printf "\n TEST: %-50s [ OK ]\n" "${msg}" + else + nfail=$((nfail+1)) + printf "\n TEST: %-50s [FAIL]\n" "${msg}" + if [ "${PAUSE_ON_FAIL}" = "yes" ]; then + echo + echo "hit enter to continue, 'q' to quit" + read a + [ "$a" = "q" ] && exit 1 + fi + fi +} + +log_section() +{ + echo + echo "######################################################################" + echo "TEST SECTION: $*" + echo "######################################################################" +} + +log_subsection() +{ + echo + echo "#########################################" + echo "TEST SUBSECTION: $*" +} + +run_cmd() +{ + echo + echo "COMMAND: $*" + eval $* +} + +get_linklocal() +{ + local dev=$1 + local pfx + local addr + + addr=$(${pfx} ip -6 -br addr show dev ${dev} | \ + awk '{ + for (i = 3; i <= NF; ++i) { + if ($i ~ /^fe80/) + print $i + } + }' + ) + addr=${addr/\/*} + + [ -z "$addr" ] && return 1 + + echo $addr + + return 0 +} + +################################################################################ +# + +setup() +{ + echo + echo "########################################" + echo "Configuring interfaces" + + set -e + + # create namespace + ip netns add ${PEER_NS} + ip -netns ${PEER_NS} li set lo up + + # add vrf table + ip li add ${VRF} type vrf table ${VRF_TABLE} + ip li set ${VRF} up + ip ro add table ${VRF_TABLE} unreachable default + ip -6 ro add table ${VRF_TABLE} unreachable default + + # create test interfaces + ip li add ${NETIFS[p1]} type veth peer name ${NETIFS[p2]} + ip li add ${NETIFS[p3]} type veth peer name ${NETIFS[p4]} + ip li add ${NETIFS[p5]} type veth peer name ${NETIFS[p6]} + ip li add ${NETIFS[p7]} type veth peer name ${NETIFS[p8]} + + # enslave vrf interfaces + for n in 5 7; do + ip li set ${NETIFS[p${n}]} vrf ${VRF} + done + + # add addresses + for n in 1 3 5 7; do + ip li set ${NETIFS[p${n}]} up + ip addr add ${V4ADDRS[p${n}]}/24 dev ${NETIFS[p${n}]} + ip addr add ${V6ADDRS[p${n}]}/64 dev ${NETIFS[p${n}]} + done + + # move peer interfaces to namespace and add addresses + for n in 2 4 6 8; do + ip li set ${NETIFS[p${n}]} netns ${PEER_NS} up + ip -netns ${PEER_NS} addr add ${V4ADDRS[p${n}]}/24 dev ${NETIFS[p${n}]} + ip -netns ${PEER_NS} addr add ${V6ADDRS[p${n}]}/64 dev ${NETIFS[p${n}]} + done + + set +e + + # let DAD complete - assume default of 1 probe + sleep 1 +} + +cleanup() +{ + # make sure we start from a clean slate + ip netns del ${PEER_NS} 2>/dev/null + for n in 1 3 5 7; do + ip link del ${NETIFS[p${n}]} 2>/dev/null + done + ip link del ${VRF} 2>/dev/null + ip ro flush table ${VRF_TABLE} + ip -6 ro flush table ${VRF_TABLE} +} + +################################################################################ +# IPv4 tests +# + +run_ip() +{ + local table="$1" + local prefix="$2" + local gw="$3" + local dev="$4" + local exp_rc="$5" + local desc="$6" + + # dev arg may be empty + [ -n "${dev}" ] && dev="dev ${dev}" + + run_cmd ip ro add table "${table}" "${prefix}"/32 via "${gw}" "${dev}" onlink + log_test $? ${exp_rc} "${desc}" +} + +valid_onlink_ipv4() +{ + # - unicast connected, unicast recursive + # + log_subsection "default VRF - main table" + + run_ip 254 ${TEST_NET4[1]}.1 ${CONGW[1]} ${NETIFS[p1]} 0 "unicast connected" + run_ip 254 ${TEST_NET4[1]}.2 ${RECGW4[1]} ${NETIFS[p1]} 0 "unicast recursive" + + log_subsection "VRF ${VRF}" + + run_ip ${VRF_TABLE} ${TEST_NET4[2]}.1 ${CONGW[2]} ${NETIFS[p5]} 0 "unicast connected" + run_ip ${VRF_TABLE} ${TEST_NET4[2]}.2 ${RECGW4[2]} ${NETIFS[p5]} 0 "unicast recursive" + + log_subsection "VRF device, PBR table" + + run_ip ${PBR_TABLE} ${TEST_NET4[2]}.3 ${CONGW[2]} ${NETIFS[p5]} 0 "unicast connected" + run_ip ${PBR_TABLE} ${TEST_NET4[2]}.4 ${RECGW4[2]} ${NETIFS[p5]} 0 "unicast recursive" +} + +invalid_onlink_ipv4() +{ + run_ip 254 ${TEST_NET4[1]}.11 ${V4ADDRS[p1]} ${NETIFS[p1]} 2 \ + "Invalid gw - local unicast address" + + run_ip ${VRF_TABLE} ${TEST_NET4[2]}.11 ${V4ADDRS[p5]} ${NETIFS[p5]} 2 \ + "Invalid gw - local unicast address, VRF" + + run_ip 254 ${TEST_NET4[1]}.101 ${V4ADDRS[p1]} "" 2 "No nexthop device given" + + run_ip 254 ${TEST_NET4[1]}.102 ${V4ADDRS[p3]} ${NETIFS[p1]} 2 \ + "Gateway resolves to wrong nexthop device" + + run_ip ${VRF_TABLE} ${TEST_NET4[2]}.103 ${V4ADDRS[p7]} ${NETIFS[p5]} 2 \ + "Gateway resolves to wrong nexthop device - VRF" +} + +################################################################################ +# IPv6 tests +# + +run_ip6() +{ + local table="$1" + local prefix="$2" + local gw="$3" + local dev="$4" + local exp_rc="$5" + local desc="$6" + + # dev arg may be empty + [ -n "${dev}" ] && dev="dev ${dev}" + + run_cmd ip -6 ro add table "${table}" "${prefix}"/128 via "${gw}" "${dev}" onlink + log_test $? ${exp_rc} "${desc}" +} + +valid_onlink_ipv6() +{ + # - unicast connected, unicast recursive, v4-mapped + # + log_subsection "default VRF - main table" + + run_ip6 254 ${TEST_NET6[1]}::1 ${V6ADDRS[p1]/::*}::64 ${NETIFS[p1]} 0 "unicast connected" + run_ip6 254 ${TEST_NET6[1]}::2 ${RECGW6[1]} ${NETIFS[p1]} 0 "unicast recursive" + run_ip6 254 ${TEST_NET6[1]}::3 ::ffff:${TEST_NET4IN6[1]} ${NETIFS[p1]} 0 "v4-mapped" + + log_subsection "VRF ${VRF}" + + run_ip6 ${VRF_TABLE} ${TEST_NET6[2]}::1 ${V6ADDRS[p5]/::*}::64 ${NETIFS[p5]} 0 "unicast connected" + run_ip6 ${VRF_TABLE} ${TEST_NET6[2]}::2 ${RECGW6[2]} ${NETIFS[p5]} 0 "unicast recursive" + run_ip6 ${VRF_TABLE} ${TEST_NET6[2]}::3 ::ffff:${TEST_NET4IN6[2]} ${NETIFS[p5]} 0 "v4-mapped" + + log_subsection "VRF device, PBR table" + + run_ip6 ${PBR_TABLE} ${TEST_NET6[2]}::4 ${V6ADDRS[p5]/::*}::64 ${NETIFS[p5]} 0 "unicast connected" + run_ip6 ${PBR_TABLE} ${TEST_NET6[2]}::5 ${RECGW6[2]} ${NETIFS[p5]} 0 "unicast recursive" + run_ip6 ${PBR_TABLE} ${TEST_NET6[2]}::6 ::ffff:${TEST_NET4IN6[2]} ${NETIFS[p5]} 0 "v4-mapped" +} + +invalid_onlink_ipv6() +{ + local lladdr + + lladdr=$(get_linklocal ${NETIFS[p1]}) || return 1 + + run_ip6 254 ${TEST_NET6[1]}::11 ${V6ADDRS[p1]} ${NETIFS[p1]} 2 \ + "Invalid gw - local unicast address" + run_ip6 254 ${TEST_NET6[1]}::12 ${lladdr} ${NETIFS[p1]} 2 \ + "Invalid gw - local linklocal address" + run_ip6 254 ${TEST_NET6[1]}::12 ${MCAST6} ${NETIFS[p1]} 2 \ + "Invalid gw - multicast address" + + lladdr=$(get_linklocal ${NETIFS[p5]}) || return 1 + run_ip6 ${VRF_TABLE} ${TEST_NET6[2]}::11 ${V6ADDRS[p5]} ${NETIFS[p5]} 2 \ + "Invalid gw - local unicast address, VRF" + run_ip6 ${VRF_TABLE} ${TEST_NET6[2]}::12 ${lladdr} ${NETIFS[p5]} 2 \ + "Invalid gw - local linklocal address, VRF" + run_ip6 ${VRF_TABLE} ${TEST_NET6[2]}::12 ${MCAST6} ${NETIFS[p5]} 2 \ + "Invalid gw - multicast address, VRF" + + run_ip6 254 ${TEST_NET6[1]}::101 ${V6ADDRS[p1]} "" 2 \ + "No nexthop device given" + + # default VRF validation is done against LOCAL table + # run_ip6 254 ${TEST_NET6[1]}::102 ${V6ADDRS[p3]/::[0-9]/::64} ${NETIFS[p1]} 2 \ + # "Gateway resolves to wrong nexthop device" + + run_ip6 ${VRF_TABLE} ${TEST_NET6[2]}::103 ${V6ADDRS[p7]/::[0-9]/::64} ${NETIFS[p5]} 2 \ + "Gateway resolves to wrong nexthop device - VRF" +} + +run_onlink_tests() +{ + log_section "IPv4 onlink" + log_subsection "Valid onlink commands" + valid_onlink_ipv4 + log_subsection "Invalid onlink commands" + invalid_onlink_ipv4 + + log_section "IPv6 onlink" + log_subsection "Valid onlink commands" + valid_onlink_ipv6 + invalid_onlink_ipv6 +} + +################################################################################ +# main + +nsuccess=0 +nfail=0 + +cleanup +setup +run_onlink_tests +cleanup + +if [ "$TESTS" != "none" ]; then + printf "\nTests passed: %3d\n" ${nsuccess} + printf "Tests failed: %3d\n" ${nfail} +fi -- 2.17.0 >From 885e86e252ff6a7c647d8ad4c8af78d5b605bda9 Mon Sep 17 00:00:00 2001 From: Shakeel Butt <shakeelb@xxxxxxxxxx> Date: Wed, 21 Feb 2018 14:45:28 -0800 Subject: [PATCH 078/592] mm, mlock, vmscan: no more skipping pagevecs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Status: RO Content-Length: 10692 Lines: 303 [ Upstream commit 9c4e6b1a7027f102990c0395296015a812525f4d ] When a thread mlocks an address space backed either by file pages which are currently not present in memory or swapped out anon pages (not in swapcache), a new page is allocated and added to the local pagevec (lru_add_pvec), I/O is triggered and the thread then sleeps on the page. On I/O completion, the thread can wake on a different CPU, the mlock syscall will then sets the PageMlocked() bit of the page but will not be able to put that page in unevictable LRU as the page is on the pagevec of a different CPU. Even on drain, that page will go to evictable LRU because the PageMlocked() bit is not checked on pagevec drain. The page will eventually go to right LRU on reclaim but the LRU stats will remain skewed for a long time. This patch puts all the pages, even unevictable, to the pagevecs and on the drain, the pages will be added on their LRUs correctly by checking their evictability. This resolves the mlocked pages on pagevec of other CPUs issue because when those pagevecs will be drained, the mlocked file pages will go to unevictable LRU. Also this makes the race with munlock easier to resolve because the pagevec drains happen in LRU lock. However there is still one place which makes a page evictable and does PageLRU check on that page without LRU lock and needs special attention. TestClearPageMlocked() and isolate_lru_page() in clear_page_mlock(). #0: __pagevec_lru_add_fn #1: clear_page_mlock SetPageLRU() if (!TestClearPageMlocked()) return smp_mb() // <--required // inside does PageLRU if (!PageMlocked()) if (isolate_lru_page()) move to evictable LRU putback_lru_page() else move to unevictable LRU In '#1', TestClearPageMlocked() provides full memory barrier semantics and thus the PageLRU check (inside isolate_lru_page) can not be reordered before it. In '#0', without explicit memory barrier, the PageMlocked() check can be reordered before SetPageLRU(). If that happens, '#0' can put a page in unevictable LRU and '#1' might have just cleared the Mlocked bit of that page but fails to isolate as PageLRU fails as '#0' still hasn't set PageLRU bit of that page. That page will be stranded on the unevictable LRU. There is one (good) side effect though. Without this patch, the pages allocated for System V shared memory segment are added to evictable LRUs even after shmctl(SHM_LOCK) on that segment. This patch will correctly put such pages to unevictable LRU. Link: http://lkml.kernel.org/r/20171121211241.18877-1-shakeelb@xxxxxxxxxx Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx> Acked-by: Vlastimil Babka <vbabka@xxxxxxx> Cc: Jérôme Glisse <jglisse@xxxxxxxxxx> Cc: Huang Ying <ying.huang@xxxxxxxxx> Cc: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxxxx> Cc: Greg Thelen <gthelen@xxxxxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: Balbir Singh <bsingharora@xxxxxxxxx> Cc: Minchan Kim <minchan@xxxxxxxxxx> Cc: Shaohua Li <shli@xxxxxx> Cc: Jan Kara <jack@xxxxxxx> Cc: Nicholas Piggin <npiggin@xxxxxxxxx> Cc: Dan Williams <dan.j.williams@xxxxxxxxx> Cc: Mel Gorman <mgorman@xxxxxxx> Cc: Hugh Dickins <hughd@xxxxxxxxxx> Cc: Vlastimil Babka <vbabka@xxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- include/linux/swap.h | 2 -- mm/mlock.c | 6 ++++ mm/swap.c | 82 +++++++++++++++++++++++++------------------- mm/vmscan.c | 59 +------------------------------ 4 files changed, 54 insertions(+), 95 deletions(-) diff --git a/include/linux/swap.h b/include/linux/swap.h index f02fb5db8914..9b31d04914eb 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -326,8 +326,6 @@ extern void deactivate_file_page(struct page *page); extern void mark_page_lazyfree(struct page *page); extern void swap_setup(void); -extern void add_page_to_unevictable_list(struct page *page); - extern void lru_cache_add_active_or_unevictable(struct page *page, struct vm_area_struct *vma); diff --git a/mm/mlock.c b/mm/mlock.c index 46af369c13e5..4cc261b22f68 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -64,6 +64,12 @@ void clear_page_mlock(struct page *page) mod_zone_page_state(page_zone(page), NR_MLOCK, -hpage_nr_pages(page)); count_vm_event(UNEVICTABLE_PGCLEARED); + /* + * The previous TestClearPageMlocked() corresponds to the smp_mb() + * in __pagevec_lru_add_fn(). + * + * See __pagevec_lru_add_fn for more explanation. + */ if (!isolate_lru_page(page)) { putback_lru_page(page); } else { diff --git a/mm/swap.c b/mm/swap.c index a77d68f2c1b6..ec21f4f906a1 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -445,30 +445,6 @@ void lru_cache_add(struct page *page) __lru_cache_add(page); } -/** - * add_page_to_unevictable_list - add a page to the unevictable list - * @page: the page to be added to the unevictable list - * - * Add page directly to its zone's unevictable list. To avoid races with - * tasks that might be making the page evictable, through eg. munlock, - * munmap or exit, while it's not on the lru, we want to add the page - * while it's locked or otherwise "invisible" to other tasks. This is - * difficult to do when using the pagevec cache, so bypass that. - */ -void add_page_to_unevictable_list(struct page *page) -{ - struct pglist_data *pgdat = page_pgdat(page); - struct lruvec *lruvec; - - spin_lock_irq(&pgdat->lru_lock); - lruvec = mem_cgroup_page_lruvec(page, pgdat); - ClearPageActive(page); - SetPageUnevictable(page); - SetPageLRU(page); - add_page_to_lru_list(page, lruvec, LRU_UNEVICTABLE); - spin_unlock_irq(&pgdat->lru_lock); -} - /** * lru_cache_add_active_or_unevictable * @page: the page to be added to LRU @@ -484,13 +460,9 @@ void lru_cache_add_active_or_unevictable(struct page *page, { VM_BUG_ON_PAGE(PageLRU(page), page); - if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED)) { + if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED)) SetPageActive(page); - lru_cache_add(page); - return; - } - - if (!TestSetPageMlocked(page)) { + else if (!TestSetPageMlocked(page)) { /* * We use the irq-unsafe __mod_zone_page_stat because this * counter is not modified from interrupt context, and the pte @@ -500,7 +472,7 @@ void lru_cache_add_active_or_unevictable(struct page *page, hpage_nr_pages(page)); count_vm_event(UNEVICTABLE_PGMLOCKED); } - add_page_to_unevictable_list(page); + lru_cache_add(page); } /* @@ -883,15 +855,55 @@ void lru_add_page_tail(struct page *page, struct page *page_tail, static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec, void *arg) { - int file = page_is_file_cache(page); - int active = PageActive(page); - enum lru_list lru = page_lru(page); + enum lru_list lru; + int was_unevictable = TestClearPageUnevictable(page); VM_BUG_ON_PAGE(PageLRU(page), page); SetPageLRU(page); + /* + * Page becomes evictable in two ways: + * 1) Within LRU lock [munlock_vma_pages() and __munlock_pagevec()]. + * 2) Before acquiring LRU lock to put the page to correct LRU and then + * a) do PageLRU check with lock [check_move_unevictable_pages] + * b) do PageLRU check before lock [clear_page_mlock] + * + * (1) & (2a) are ok as LRU lock will serialize them. For (2b), we need + * following strict ordering: + * + * #0: __pagevec_lru_add_fn #1: clear_page_mlock + * + * SetPageLRU() TestClearPageMlocked() + * smp_mb() // explicit ordering // above provides strict + * // ordering + * PageMlocked() PageLRU() + * + * + * if '#1' does not observe setting of PG_lru by '#0' and fails + * isolation, the explicit barrier will make sure that page_evictable + * check will put the page in correct LRU. Without smp_mb(), SetPageLRU + * can be reordered after PageMlocked check and can make '#1' to fail + * the isolation of the page whose Mlocked bit is cleared (#0 is also + * looking at the same page) and the evictable page will be stranded + * in an unevictable LRU. + */ + smp_mb(); + + if (page_evictable(page)) { + lru = page_lru(page); + update_page_reclaim_stat(lruvec, page_is_file_cache(page), + PageActive(page)); + if (was_unevictable) + count_vm_event(UNEVICTABLE_PGRESCUED); + } else { + lru = LRU_UNEVICTABLE; + ClearPageActive(page); + SetPageUnevictable(page); + if (!was_unevictable) + count_vm_event(UNEVICTABLE_PGCULLED); + } + add_page_to_lru_list(page, lruvec, lru); - update_page_reclaim_stat(lruvec, file, active); trace_mm_lru_insertion(page, lru); } diff --git a/mm/vmscan.c b/mm/vmscan.c index b3f5e337b64a..cbe86f127f5b 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -790,64 +790,7 @@ int remove_mapping(struct address_space *mapping, struct page *page) */ void putback_lru_page(struct page *page) { - bool is_unevictable; - int was_unevictable = PageUnevictable(page); - - VM_BUG_ON_PAGE(PageLRU(page), page); - -redo: - ClearPageUnevictable(page); - - if (page_evictable(page)) { - /* - * For evictable pages, we can use the cache. - * In event of a race, worst case is we end up with an - * unevictable page on [in]active list. - * We know how to handle that. - */ - is_unevictable = false; - lru_cache_add(page); - } else { - /* - * Put unevictable pages directly on zone's unevictable - * list. - */ - is_unevictable = true; - add_page_to_unevictable_list(page); - /* - * When racing with an mlock or AS_UNEVICTABLE clearing - * (page is unlocked) make sure that if the other thread - * does not observe our setting of PG_lru and fails - * isolation/check_move_unevictable_pages, - * we see PG_mlocked/AS_UNEVICTABLE cleared below and move - * the page back to the evictable list. - * - * The other side is TestClearPageMlocked() or shmem_lock(). - */ - smp_mb(); - } - - /* - * page's status can change while we move it among lru. If an evictable - * page is on unevictable list, it never be freed. To avoid that, - * check after we added it to the list, again. - */ - if (is_unevictable && page_evictable(page)) { - if (!isolate_lru_page(page)) { - put_page(page); - goto redo; - } - /* This means someone else dropped this page from LRU - * So, it will be freed or putback to LRU again. There is - * nothing to do here. - */ - } - - if (was_unevictable && !is_unevictable) - count_vm_event(UNEVICTABLE_PGRESCUED); - else if (!was_unevictable && is_unevictable) - count_vm_event(UNEVICTABLE_PGCULLED); - + lru_cache_add(page); put_page(page); /* drop ref from isolate */ } -- 2.17.0 >From acfac307606b58de445794efceb40774b0c02994 Mon Sep 17 00:00:00 2001 From: Randy Dunlap <rdunlap@xxxxxxxxxxxxx> Date: Mon, 12 Feb 2018 13:18:38 -0800 Subject: [PATCH 095/592] fs/signalfd: fix build error for BUS_MCEERR_AR Status: RO Content-Length: 1650 Lines: 48 [ Upstream commit 9026e820cbd2ea39a06a129ecdddf2739bd3602b ] Fix build error in fs/signalfd.c by using same method that is used in kernel/signal.c: separate blocks for different signal si_code values. ./fs/signalfd.c: error: 'BUS_MCEERR_AR' undeclared (first use in this function) Reported-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> Signed-off-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx> Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx> Signed-off-by: Eric W. Biederman <ebiederm@xxxxxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- fs/signalfd.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/fs/signalfd.c b/fs/signalfd.c index 1c667af86da5..3a2d8ef83740 100644 --- a/fs/signalfd.c +++ b/fs/signalfd.c @@ -118,13 +118,22 @@ static int signalfd_copyinfo(struct signalfd_siginfo __user *uinfo, err |= __put_user(kinfo->si_trapno, &uinfo->ssi_trapno); #endif #ifdef BUS_MCEERR_AO - /* + /* + * Other callers might not initialize the si_lsb field, + * so check explicitly for the right codes here. + */ + if (kinfo->si_signo == SIGBUS && + kinfo->si_code == BUS_MCEERR_AO) + err |= __put_user((short) kinfo->si_addr_lsb, + &uinfo->ssi_addr_lsb); +#endif +#ifdef BUS_MCEERR_AR + /* * Other callers might not initialize the si_lsb field, * so check explicitly for the right codes here. */ if (kinfo->si_signo == SIGBUS && - (kinfo->si_code == BUS_MCEERR_AR || - kinfo->si_code == BUS_MCEERR_AO)) + kinfo->si_code == BUS_MCEERR_AR) err |= __put_user((short) kinfo->si_addr_lsb, &uinfo->ssi_addr_lsb); #endif -- 2.17.0 >From 33dc0748f24ff5590990ae9c8bd1bd6c837a112e Mon Sep 17 00:00:00 2001 From: David Lechner <david@xxxxxxxxxxxxxx> Date: Mon, 15 Jan 2018 11:29:31 -0600 Subject: [PATCH 542/592] ARM: davinci_all_defconfig: set CONFIG_DAVINCI_WATCHDOG=y Status: RO Content-Length: 1249 Lines: 33 [ Upstream commit 35ba26772c827dbfc03be8adc3af8ff0d294b38f ] This changes CONFIG_DAVINCI_WATCHDOG from a module to a compiled-in option. Since the reset function has been moved out of the mach code in commit 94f2e94514e5 ("ARM: davinci: remove watchdog reset") and into the watchdog driver, devices cannot reboot unless the watchdog driver is loaded, so make it a compiled-in option so that we can always reboot, even when modules are not loaded. Cc: Sekhar Nori <nsekhar@xxxxxx> Suggested-by: Adam Ford <aford173@xxxxxxxxx> Signed-off-by: David Lechner <david@xxxxxxxxxxxxxx> Signed-off-by: Sekhar Nori <nsekhar@xxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- arch/arm/configs/davinci_all_defconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/configs/davinci_all_defconfig b/arch/arm/configs/davinci_all_defconfig index 27d9720f7207..64b9f3c2524f 100644 --- a/arch/arm/configs/davinci_all_defconfig +++ b/arch/arm/configs/davinci_all_defconfig @@ -124,7 +124,7 @@ CONFIG_POWER_RESET=y CONFIG_POWER_RESET_GPIO=y CONFIG_BATTERY_LEGO_EV3=m CONFIG_WATCHDOG=y -CONFIG_DAVINCI_WATCHDOG=m +CONFIG_DAVINCI_WATCHDOG=y CONFIG_MFD_DM355EVM_MSP=y CONFIG_TPS6507X=y CONFIG_REGULATOR=y -- 2.17.0 >From cd0ec7a6623a37c2b81829cef83f08a1deb54df7 Mon Sep 17 00:00:00 2001 From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx> Date: Tue, 10 Apr 2018 16:30:23 -0700 Subject: [PATCH 320/592] ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM=y Status: RO Content-Length: 2995 Lines: 75 [ Upstream commit 3d2054ad8c2d5100b68b0c0405f89fd90bf4107b ] CMA area is now managed by the separate zone, ZONE_MOVABLE, to fix many MM related problems. In this implementation, if CONFIG_HIGHMEM = y, then ZONE_MOVABLE is considered as HIGHMEM and the memory of the CMA area is also considered as HIGHMEM. That means that they are considered as the page without direct mapping. However, CMA area could be in a lowmem and the memory could have direct mapping. In ARM, when establishing a new mapping for DMA, direct mapping should be cleared since two mapping with different cache policy could cause unknown problem. With this patch, PageHighmem() for the CMA memory located in lowmem returns true so that the function for DMA mapping cannot notice whether it needs to clear direct mapping or not, correctly. To handle this situation, this patch always clears direct mapping for such CMA memory. Link: http://lkml.kernel.org/r/1512114786-5085-4-git-send-email-iamjoonsoo.kim@xxxxxxx Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx> Tested-by: Tony Lindgren <tony@xxxxxxxxxxx> Cc: "Aneesh Kumar K . V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: Laura Abbott <lauraa@xxxxxxxxxxxxxx> Cc: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxx> Cc: Michal Nazarewicz <mina86@xxxxxxxxxx> Cc: Minchan Kim <minchan@xxxxxxxxxx> Cc: Rik van Riel <riel@xxxxxxxxxx> Cc: Russell King <linux@xxxxxxxxxxxxxxx> Cc: Vlastimil Babka <vbabka@xxxxxxx> Cc: Will Deacon <will.deacon@xxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- arch/arm/mm/dma-mapping.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index fcf1473d6fed..6fb3a3c063b7 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -481,6 +481,12 @@ void __init dma_contiguous_early_fixup(phys_addr_t base, unsigned long size) void __init dma_contiguous_remap(void) { int i; + + if (!dma_mmu_remap_num) + return; + + /* call flush_cache_all() since CMA area would be large enough */ + flush_cache_all(); for (i = 0; i < dma_mmu_remap_num; i++) { phys_addr_t start = dma_mmu_remap[i].base; phys_addr_t end = start + dma_mmu_remap[i].size; @@ -513,7 +519,15 @@ void __init dma_contiguous_remap(void) flush_tlb_kernel_range(__phys_to_virt(start), __phys_to_virt(end)); - iotable_init(&map, 1); + /* + * All the memory in CMA region will be on ZONE_MOVABLE. + * If that zone is considered as highmem, the memory in CMA + * region is also considered as highmem even if it's + * physical address belong to lowmem. In this case, + * re-mapping isn't required. + */ + if (!is_highmem_idx(ZONE_MOVABLE)) + iotable_init(&map, 1); } } -- 2.17.0 >From 429739b63557ca386a693af59ff30b993e791f88 Mon Sep 17 00:00:00 2001 From: Borislav Petkov <bp@xxxxxxx> Date: Wed, 21 Feb 2018 11:18:56 +0100 Subject: [PATCH 560/592] x86/mce/AMD: Collect error info even if valid bits are not set Status: RO Content-Length: 1774 Lines: 53 [ Upstream commit 4b1e84276a6172980c5bf39aa091ba13e90d6dad ] The MCA banks log error info into MCA_ADDR, MCA_MISC0, and MCA_SYND even if the corresponding valid bits are not set: "Error handlers should save the values in MCA_ADDR, MCA_MISC0, and MCA_SYND even if MCA_STATUS[AddrV], MCA_STATUS[MiscV], and MCA_STATUS[SyndV] are zero." Do so by setting those bits so that code down the MCE processing path doesn't need to be changed. Signed-off-by: Borislav Petkov <bp@xxxxxxx> Cc: Borislav Petkov <bp@xxxxxxxxx> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Cc: Tony Luck <tony.luck@xxxxxxxxx> Cc: linux-edac <linux-edac@xxxxxxxxxxxxxxx> Link: http://lkml.kernel.org/r/20180221101900.10326-5-bp@xxxxxxxxx Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- arch/x86/kernel/cpu/mcheck/mce.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index 28d27de08545..12cd6b66053a 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -445,6 +445,20 @@ static inline void mce_gather_info(struct mce *m, struct pt_regs *regs) if (mca_cfg.rip_msr) m->ip = mce_rdmsrl(mca_cfg.rip_msr); } + + /* + * Error handlers should save the values in MCA_ADDR, MCA_MISC0, and + * MCA_SYND even if MCA_STATUS[AddrV], MCA_STATUS[MiscV], and + * MCA_STATUS[SyndV] are zero. + */ + if (m->cpuvendor == X86_VENDOR_AMD) { + u64 status = MCI_STATUS_ADDRV | MCI_STATUS_MISCV; + + if (mce_flags.smca) + status |= MCI_STATUS_SYNDV; + + m->status |= status; + } } int mce_available(struct cpuinfo_x86 *c) -- 2.17.0 >From dddf7a51377b2c7321ec698e74c73c5043f970c4 Mon Sep 17 00:00:00 2001 From: Rolf Evers-Fischer <rolf.evers.fischer@xxxxxxxxx> Date: Wed, 28 Feb 2018 18:32:19 +0100 Subject: [PATCH 508/592] PCI: endpoint: Fix kernel panic after put_device() Status: RO Content-Length: 1123 Lines: 33 [ Upstream commit 9eef6a5c3b0bf90eb292d462ea267bcb6ad1c334 ] 'put_device()' calls the relase function 'pci_epf_dev_release()', which already frees 'epf->name' and 'epf'. Therefore we must not free them again after 'put_device()'. Fixes: 5e8cb4033807 ("PCI: endpoint: Add EP core layer to enable EP controller and EP functions") Signed-off-by: Rolf Evers-Fischer <rolf.evers.fischer@xxxxxxxxx> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx> Acked-by: Kishon Vijay Abraham I <kishon@xxxxxx> Reviewed-by: Andy Shevchenko <andy.shevchenko@xxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- drivers/pci/endpoint/pci-epf-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pci/endpoint/pci-epf-core.c b/drivers/pci/endpoint/pci-epf-core.c index ae1611a62808..14b5a42ee8dc 100644 --- a/drivers/pci/endpoint/pci-epf-core.c +++ b/drivers/pci/endpoint/pci-epf-core.c @@ -254,7 +254,7 @@ struct pci_epf *pci_epf_create(const char *name) put_dev: put_device(dev); - kfree(epf->name); + return ERR_PTR(ret); free_func_name: kfree(func_name); -- 2.17.0 >From 630a6d27e4bf3c8c9a87b635a7705162b59527cb Mon Sep 17 00:00:00 2001 From: Bart Van Assche <bart.vanassche@xxxxxxx> Date: Wed, 28 Feb 2018 10:15:33 -0800 Subject: [PATCH 516/592] block: Fix a race between request queue removal and the block cgroup controller Status: RO Content-Length: 3069 Lines: 95 [ Upstream commit a063057d7c731cffa7d10740e8ebc2970df8dbb3 ] Avoid that the following race can occur: blk_cleanup_queue() blkcg_print_blkgs() spin_lock_irq(lock) (1) spin_lock_irq(blkg->q->queue_lock) (2,5) q->queue_lock = &q->__queue_lock (3) spin_unlock_irq(lock) (4) spin_unlock_irq(blkg->q->queue_lock) (6) (1) take driver lock; (2) busy loop for driver lock; (3) override driver lock with internal lock; (4) unlock driver lock; (5) can take driver lock now; (6) but unlock internal lock. This change is safe because only the SCSI core and the NVME core keep a reference on a request queue after having called blk_cleanup_queue(). Neither driver accesses any of the removed data structures between its blk_cleanup_queue() and blk_put_queue() calls. Reported-by: Joseph Qi <joseph.qi@xxxxxxxxxxxxxxxxx> Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxx> Reviewed-by: Joseph Qi <joseph.qi@xxxxxxxxxxxxxxxxx> Cc: Jan Kara <jack@xxxxxxxx> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- block/blk-core.c | 31 +++++++++++++++++++++++++++++++ block/blk-sysfs.c | 7 ------- 2 files changed, 31 insertions(+), 7 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 1feeb1a8aad9..bbaddd2f9b4a 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -681,6 +681,37 @@ void blk_cleanup_queue(struct request_queue *q) del_timer_sync(&q->backing_dev_info->laptop_mode_wb_timer); blk_sync_queue(q); + /* + * I/O scheduler exit is only safe after the sysfs scheduler attribute + * has been removed. + */ + WARN_ON_ONCE(q->kobj.state_in_sysfs); + + /* + * Since the I/O scheduler exit code may access cgroup information, + * perform I/O scheduler exit before disassociating from the block + * cgroup controller. + */ + if (q->elevator) { + ioc_clear_queue(q); + elevator_exit(q, q->elevator); + q->elevator = NULL; + } + + /* + * Remove all references to @q from the block cgroup controller before + * restoring @q->queue_lock to avoid that restoring this pointer causes + * e.g. blkcg_print_blkgs() to crash. + */ + blkcg_exit_queue(q); + + /* + * Since the cgroup code may dereference the @q->backing_dev_info + * pointer, only decrease its reference count after having removed the + * association with the block cgroup controller. + */ + bdi_put(q->backing_dev_info); + if (q->mq_ops) blk_mq_free_queue(q); percpu_ref_exit(&q->q_usage_counter); diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index e54be402899d..920e382b3c59 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -801,13 +801,6 @@ static void __blk_release_queue(struct work_struct *work) if (test_bit(QUEUE_FLAG_POLL_STATS, &q->queue_flags)) blk_stat_remove_callback(q, q->poll_cb); blk_stat_free_callback(q->poll_cb); - bdi_put(q->backing_dev_info); - blkcg_exit_queue(q); - - if (q->elevator) { - ioc_clear_queue(q); - elevator_exit(q, q->elevator); - } blk_free_queue_stats(q->stats); -- 2.17.0 >From 7f458a63eda6cb9541af576f95e444f4db7b00f7 Mon Sep 17 00:00:00 2001 From: Emmanuel Grumbach <emmanuel.grumbach@xxxxxxxxx> Date: Mon, 26 Mar 2018 16:21:04 +0300 Subject: [PATCH 370/592] mac80211: don't WARN on bad WMM parameters from buggy APs Status: RO Content-Length: 1386 Lines: 38 [ Upstream commit c470bdc1aaf36669e04ba65faf1092b2d1c6cabe ] Apparently, some APs are buggy enough to send a zeroed WMM IE. Don't WARN on this since this is not caused by a bug on the client's system. This aligns the condition of the WARNING in drv_conf_tx with the validity check in ieee80211_sta_wmm_params. We will now pick the default values whenever we get a zeroed WMM IE. This has been reported here: https://bugzilla.kernel.org/show_bug.cgi?id=199161 Fixes: f409079bb678 ("mac80211: sanity check CW_min/CW_max towards driver") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@xxxxxxxxx> Signed-off-by: Johannes Berg <johannes.berg@xxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> --- net/mac80211/mlme.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index 9115cc52ce83..7d57654086d1 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -1797,7 +1797,8 @@ static bool ieee80211_sta_wmm_params(struct ieee80211_local *local, params[ac].acm = acm; params[ac].uapsd = uapsd; - if (params[ac].cw_min > params[ac].cw_max) { + if (params->cw_min == 0 || + params[ac].cw_min > params[ac].cw_max) { sdata_info(sdata, "AP has invalid WMM params (CWmin/max=%d/%d for ACI %d), using defaults\n", params[ac].cw_min, params[ac].cw_max, aci); -- 2.17.0