On Thu, Aug 19, 2021 at 5:48 PM, Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:
>
> On 2021-08-19 08:50:27 [+0900], Jeaho Hwang wrote:
> > Without RT, udc_irq runs as a forced threaded irq handler, so it runs
> > without any interruption or preemption. NO similar case is found on
> > non-RT.
>
> I see only a devm_request_irq() so no force-threading here. Booting with
> threadirqs would not lead to the problem since commit
> 81e2073c175b8 ("genirq: Disable interrupts for force threaded handlers")
>

I was wrong. The udc threaded irq handler can be interrupted by the twd
interrupt even on non-RT with threaded irqs. I had relied on Chen's comment:
"The function hw_ep_prime is only called at udc_irq which is registered as
top-half irq handlers. Why the timer interrupt is occurred when hw_ep_prime
is executing?".

We ran additional experiments and got the results below. The RNDIS host was
Windows.

  - RT, 1ms delay between the first ENDPTSETUPSTAT read and priming:
    the error case occurred.
  - RT, 1ms delay + irq_save: no error case occurred.
  - non-RT, threaded irq, 1ms delay: no error case occurred, even when twd
    fired during the function's execution.

So it doesn't seem to be a timing issue, but interrupts definitely affect
priming on the RT kernel. Do any of the RT experts have an idea about the
cause?

If isr_tr_complete_handler fails to prime the endpoint, it calls _ep_set_halt
and enters an infinite loop in hw_ep_set_halt. This is an actual problem we
experienced. So we disabled irqs inside hw_ep_prime to avoid the error case,
and also added a timeout inside the hw_ep_set_halt loop as a workaround. The
timeout patch has been submitted to linux-usb.
( https://marc.info/?l=linux-usb&m=162918269024007&w=2 )

We withdrew this patch because we don't know whether disabling irqs is the
best way to solve the problem, and because udc would work fine with the
hw_ep_set_halt workaround even if hw_ep_prime fails.

But we are still trying to find the cause of this symptom, so we would
really appreciate it if RT or USB experts could share some ideas or suggest
where to report it.
Xilinx doesn't provide any support without their official kernel :(

Thanks for the discussion, Sebastian.

Jeaho Hwang.

> …
> > > If this function here is sensitive to timing (say the cpu_relax() loop
> > > gets interrupt for 1ms) then it has to be documented as such.
> >
> > The controller sets ENDPTSETUPSTAT register if the host sent a setup packet.
> > yes it is a timing problem. I will document that and resubmit again if
> > you agree that local_irq_save could help from the timing problem.
> >
> > Thanks for the advice.
>
> If it is really a timing issue in the function as you describe below
> then disabling interrupts would help and it is indeed an RT only issue.
>
> So you read OP_ENDPTSETUPSTAT, it is 0, all good.
> You write OP_ENDPTPRIME, wait for it to be cleared.
> Then you read OP_ENDPTSETUPSTAT again and if it is 0, all good.
>
> And the TWD interrupt could delay say the second read would read 1 and
> it is invalidated. Which looks odd.
> However, it is "okay" if the TWD interrupt happens after the second
> read? Even if the host sends a setup packet, nothing breaks?
> Do you have numbers on how long irq-off section is here? It seems to
> depend on how long the HW needs to clear the OP_ENDPTPRIME bits.
>
> Sebastian

--
황재호, Jay Hwang, linux team manager of RTst
010-7242-1593
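
For readers following the thread, the priming sequence Sebastian describes (read OP_ENDPTSETUPSTAT, write OP_ENDPTPRIME and wait for it to clear, read OP_ENDPTSETUPSTAT again) can be modeled in plain userspace C. This is only a rough sketch under stated assumptions, not the driver code: `struct fake_udc`, `fake_hw_tick()`, `model_ep_prime()` and the `PRIME_*` result codes are all hypothetical names, the "hardware" is a simulated counter, and the bound on the poll loop is an illustrative addition borrowed from the timeout idea proposed for the hw_ep_set_halt loop, not something the thread says hw_ep_prime has.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller state; not the driver's types. */
struct fake_udc {
    uint32_t endptsetupstat;     /* models OP_ENDPTSETUPSTAT */
    uint32_t endptprime;         /* models OP_ENDPTPRIME */
    unsigned polls_until_clear;  /* polls the fake HW needs to clear the prime bit */
    unsigned setup_at_poll;      /* poll at which a setup packet arrives; 0 = never */
    unsigned polls_done;         /* polls performed so far */
};

enum prime_result { PRIME_OK = 0, PRIME_SETUP_PENDING = -1, PRIME_TIMEOUT = -2 };

/* Advance the fake hardware by one poll (endpoint 0 only, i.e. bit 0). */
static void fake_hw_tick(struct fake_udc *u)
{
    u->polls_done++;
    if (u->setup_at_poll && u->polls_done == u->setup_at_poll)
        u->endptsetupstat |= 1u;   /* host sent a setup packet */
    if (u->polls_done >= u->polls_until_clear)
        u->endptprime = 0;         /* HW finished priming */
}

/* Model of the check / prime / wait / re-check sequence with a bounded wait. */
static enum prime_result model_ep_prime(struct fake_udc *u, uint32_t bit,
                                        unsigned max_polls)
{
    assert(max_polls > 0);

    /* First read: a pending setup packet means priming must not start. */
    if (u->endptsetupstat & bit)
        return PRIME_SETUP_PENDING;

    u->endptprime |= bit;

    /* Wait for the prime bit to clear, but give up after max_polls polls. */
    for (unsigned i = 0; u->endptprime & bit; i++) {
        if (i == max_polls)
            return PRIME_TIMEOUT;
        fake_hw_tick(u);
    }

    /* Second read: a setup packet that arrived meanwhile invalidates the prime. */
    if (u->endptsetupstat & bit)
        return PRIME_SETUP_PENDING;

    return PRIME_OK;
}
```

In this model, anything that stretches the window between the two reads (such as an interrupt in the real system) only matters if a setup packet lands inside that window, which is the case Sebastian's questions are probing.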