DSP Multiprocessor Implementation for Future Wireless Base
Real-Time DSP Multiprocessor Implementation for Future Wireless Base-Station Receivers
Bryan Jones, Sridhar Rajagopal, and Dr. Joseph Cavallaro

Wireless Information Appliance – RENE
• Home Area Wireless LAN
• High Speed Office Wireless LAN
• Outdoor CDMA Cellular Network

Wireless Information Appliance
[Diagram: wireless information appliance and base station; channel impairments shown: noise, attenuation, MAI (multiple access interference), multipath reflections, fading]
Challenges:
• Higher data rates
• Longer battery life (lower power signals)

Wireless Information Appliance
Solution: Advanced DS-CDMA joint multiuser channel estimation and detection
• Fixed-point friendly
• Focus on baseband processing
• Real-world: asynchronous, fading channel
• Performance includes both estimation and detection

Outline
• Algorithms for joint estimation and detection
• Wireless testbed (Simulink + RealSync)
• Multiprocessor implementation
• Results and conclusions

Algorithms – channel estimation
As each bit arrives, form the auto- and cross-correlation matrices from windowed data. Here b holds pilot bits or detected bits, r holds chips from the antenna, and the window index runs from 0 (newest) to L (oldest). The newest entry is added (update) and the oldest is removed (downdate):

$$R_{bb}^{(i)} = R_{bb}^{(i-1)} + b_0 b_0^T - b_L b_L^T$$
$$R_{br}^{(i)} = R_{br}^{(i-1)} + b_0 r_0^T - b_L r_L^T$$

Update the channel estimate iteratively. The direct solution

$$A = R_{bb}^{-1} R_{br}$$

becomes

$$A^{(i)} = A^{(i-1)} - \mu \left( A^{(i-1)} R_{bb}^{(i)} - R_{br}^{(i)} \right)$$

• µ controls convergence behavior.
• A contains both amplitude and delay information for each user.
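The sliding-window update/downdate and the iterative channel update can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' DSP code: the class name, the `simulate` helper, the step size `mu = 0.5 / window`, and the synthetic signal model are all assumptions. The slides do not pin down the transposition convention, so the sketch stores the estimate with one row per bit (the shape of R_bb^{-1} R_br) and writes the gradient product as R_bb A so that the shapes agree with R_br = Σ b r^T.

```python
from collections import deque

import numpy as np


class SlidingWindowEstimator:
    """Hypothetical helper: windowed correlations plus a gradient-style channel update."""

    def __init__(self, n_bits, n_chips, window, mu):
        self.R_bb = np.zeros((n_bits, n_bits))   # bit auto-correlation
        self.R_br = np.zeros((n_bits, n_chips))  # bit/chip cross-correlation
        self.A = np.zeros((n_bits, n_chips))     # channel estimate (assumed layout)
        self.window = window
        self.mu = mu
        self.history = deque()                   # (b, r) pairs currently in the window

    def step(self, b0, r0):
        b0 = np.asarray(b0, dtype=float)
        r0 = np.asarray(r0, dtype=float)
        # Update: add the newest bit vector and received chips.
        self.R_bb += np.outer(b0, b0)
        self.R_br += np.outer(b0, r0)
        self.history.append((b0, r0))
        # Downdate: remove the oldest entry once the window is full.
        if len(self.history) > self.window:
            bL, rL = self.history.popleft()
            self.R_bb -= np.outer(bL, bL)
            self.R_br -= np.outer(bL, rL)
        # One gradient step toward the solution of R_bb A = R_br (no matrix inverse).
        self.A -= self.mu * (self.R_bb @ self.A - self.R_br)
        return self.A


def simulate(n_bits=4, n_chips=8, window=64, n_steps=400, noise=0.01, seed=0):
    """Synthetic check: r = A_true^T b + noise, with random +/-1 bits (assumed model)."""
    rng = np.random.default_rng(seed)
    A_true = rng.standard_normal((n_bits, n_chips))
    est = SlidingWindowEstimator(n_bits, n_chips, window, mu=0.5 / window)
    for _ in range(n_steps):
        b = rng.choice([-1.0, 1.0], size=n_bits)
        r = A_true.T @ b + noise * rng.standard_normal(n_chips)
        est.step(b, r)
    return est, A_true
```

With ±1 bits the trace of R_bb is at most `n_bits * window`, so a step size of `0.5 / window` keeps the iteration stable in this toy setup; a real system would choose µ from its own window length and signal scaling.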
Algorithms – detection (CMF)
Separate odd and even columns of the channel estimate:

$$A_0, A_1 \Leftarrow A$$

Form an initial estimate of the users' bits via code-matched filtering:
• Soft: $y_0^{(0)} = A_1^T r_1 + A_0^T r_0$
• Hard: $d_0^{(0)} = \mathrm{sign}(y_0^{(0)})$

Algorithms – detection (PIC)
Form the L, R, C matrices from the channel estimate:

$$L = A_0^T A_1, \qquad R = L^T, \qquad C = A_1^T A_1 + A_0^T A_0, \quad \text{let } \mathrm{diag}(C) = 0$$

Improve the estimate of the users' bits via parallel interference cancellation:

$$y_{cur}^{(i)} = y_{cur}^{(0)} - L\, d_{prev}^{(i-1)} - C\, d_{cur}^{(i-1)} - R\, d_{next}^{(i-1)}$$
$$d_{cur}^{(i)} = \mathrm{sign}(y_{cur}^{(i)})$$

Wireless testbed – Simulink
• Provide a rapid development / debug environment
• Generate data for a variety of SNRs, users, spreading codes, and channels
• Determine bit error rate

Wireless testbed – Simulink
Joint estimation and detection run on the DSP, while data generation and analysis run on the host.

Wireless testbed – RealSync
[Diagram: Simulink in → RealSync S-function → DSP (GetMatrix(in1, in2); estimate, detect; PutMatrix(out)) → Simulink out]

Multiprocessor implementation
Sundance multi-processor board with 3L Diamond multiprocessor OS
• Twin TI TMS320C6701 processors
• Twin Xilinx Virtex 300K gate FPGAs
• 3L software allows easy reconfiguration of programs and tasks among processors

Multiprocessor implementation
• Interprocessor communication via comm-ports (no shared memory) at 5 MB/sec; the processor blocks during data transfer.
• Task partitioning: estimator on one processor, detector on the other.
• Goal: keep both processors maximally busy.

Results – static single proc.
[Figure: Single-processor performance. Y-axis: Performance (msec/bit), log scale from 10^-3 to 10^1. X-axis: Number of users (2 to 16). Curves: Multi-user estimation; PIC + CMF; PIC only; CMF; Sliding correlator estimation.]

Results – single vs. dual detection
[Figure: Single / dual-processor detector comparison. Y-axis: System performance (msec/bit), 0 to 0.14. X-axis: Number of users (2 to 16). Curves: Dual processors, PIC + CMF; Single processor, PIC + CMF; Dual processors, CMF; Single processor, CMF; Interprocessor comm overhead.]

Results – tracking single vs. dual
[Figure: Projected single/dual-processor tracking comparison. Y-axis: System performance (msec/bit), 0 to 0.18. X-axis: Number of users (2 to 16). Curves: Single-processor; Dual-processor (comm overhead); Dual-processor (no comm overhead); Interprocessor comm overhead.]

Conclusions
• Performance measures should include channel estimation and detection time.
• Estimation and detection map well to a dual-processor implementation.

"The right algorithms, the right tools… the real world"
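For reference, the detection stage from the algorithm slides (code-matched filtering followed by parallel interference cancellation) can be sketched in NumPy as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names are hypothetical. The channel estimate A is assumed to be an N × 2K array whose even-indexed columns form A0 and odd-indexed columns form A1. Each received window is assumed to follow r_i = A0 d_i + A1 d_(i-1), so each bit is matched against its own window and the following one. The slides do not spell out this window alignment; it is chosen here so that L = A0^T A1 multiplies the previous bits, as in the PIC update.

```python
import numpy as np


def split_channel(A):
    """Assumed layout: even-indexed columns -> A0, odd-indexed columns -> A1."""
    return A[:, 0::2], A[:, 1::2]


def hard(Y):
    """Hard decision; ties at exactly zero are mapped to +1."""
    return np.where(Y >= 0, 1.0, -1.0)


def synthesize(A, D):
    """Noise-free received windows for K x M bits D under the assumed model."""
    A0, A1 = split_channel(A)
    M = D.shape[1]
    Rx = np.zeros((A.shape[0], M + 1))
    Rx[:, :M] += A0 @ D   # part of each bit that falls in its own window
    Rx[:, 1:] += A1 @ D   # part that spills into the following window
    return Rx


def cmf(A, Rx):
    """Code-matched filter: soft outputs y^(0) for every bit in the block."""
    A0, A1 = split_channel(A)
    return A0.T @ Rx[:, :-1] + A1.T @ Rx[:, 1:]


def pic_matrices(A):
    """L, C, R from the channel estimate, with diag(C) set to zero."""
    A0, A1 = split_channel(A)
    L = A0.T @ A1
    C = A1.T @ A1 + A0.T @ A0
    np.fill_diagonal(C, 0.0)
    return L, C, L.T


def pic(A, Y0, stages=3):
    """Parallel interference cancellation starting from the CMF hard decisions."""
    L, C, R = pic_matrices(A)
    D = hard(Y0)
    pad = np.zeros((D.shape[0], 1))
    for _ in range(stages):
        D_prev = np.hstack([pad, D[:, :-1]])  # previous bits (zero before the block)
        D_next = np.hstack([D[:, 1:], pad])   # next bits (zero after the block)
        Y = Y0 - L @ D_prev - C @ D - R @ D_next
        D = hard(Y)
    return D
```

A small near-far case illustrates the benefit: with a weak user and a strong, correlated user, the matched filter alone can flip the weak user's bits, while subtracting the strong user's estimated contribution recovers them.

```python
A = np.array([[1.0, 0.0, 1.5, 0.0],
              [0.0, 0.0, 2.0, 0.0]])       # synchronous toy channel (A1 = 0)
D_true = np.array([[1.0, 1.0, -1.0],
                   [-1.0, 1.0, 1.0]])
Y0 = cmf(A, synthesize(A, D_true))
print(hard(Y0))   # CMF decisions: weak user wrong on bits 0 and 2
print(pic(A, Y0)) # PIC decisions: matches D_true
```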