Topical Workshop: FPGAs in Research - Applications, Technologies and Tools

Europe/Berlin
Room 110 in Building 2.5 (Zentralinstitut für Elektronik, Forschungszentrum Jülich)

Description
Second topical workshop within the framework of the Helmholtz Portfolio "Detector Technologies and Systems Platform"
Participants
  • Andreas Erven
  • Andriy Ushakov
  • André Goerres
  • Björn Spruck
  • Cahit Ugur
  • Diana Goehringer
  • Felix Beckmann
  • Frantisek Krivan
  • Franz Peter Zantis
  • Günter Kemmerling
  • Harald Kleines
  • Holger Nöldgen
  • Hubert Gorke
  • Ivan Rusanov
  • Joern Plewka
  • Jose Rodrigo Azambuja
  • Jörg Burmester
  • Karl-Heinz Blumenhagen
  • Marius Wensing
  • Markus Dambacher
  • Matthias Balzer
  • Matthias Drochner
  • Michael Ramm
  • Michael Schnell
  • Michael Traxler
  • Michele Caselle
  • Peter Wuestner
  • Philipp Födisch
  • Qingqing Xia
  • Renhai Xiong
  • Sergey Suslov
  • Simone Esch
  • Stephan Meyer-Loges
  • Thomas Kleisch
  • Thomas Stöhlker
  • Tobias Stockmanns
  • Tom Neubert
  • Uwe Clemens
  • Uwe Spillmann
  • Wilhelm Erven
  • Monday, December 3
    • 12:00 PM
      Lunch Room 110 in Building 2.5

    • Talks MO1 Room 110 in Building 2.5

      • 1
        Introduction
        Organisational details of the workshop
        Speaker: Mr Harald Kleines (Forschungszentrum Jülich)
        Slides
      • 2
        Zentralinstitut für Elektronik - Facts, Figures, Tasks & Topics
        Short Overview of ZEL (Zentralinstitut für Elektronik)
        Speaker: Mr Harald Kleines (Forschungszentrum Jülich)
        Slides
      • 3
        FPGA-based digital gamma spectroscopy for monitoring environmental radioactivity
        Radioactive incidents such as Chernobyl and Fukushima have shown that nationwide monitoring of radioactivity in Germany is an important task. To fulfil this task even better in the future, the current measuring network of the Federal Office for Radiation Protection (Bundesamt für Strahlenschutz, BfS) is to be extended with spectroscopic systems for nuclide identification. The talk presents the individual steps in the development of such a system, which was carried out in a cooperation between the University of Freiburg and the BfS. The system is based on (Cd,Zn)Te detectors, which are additionally held at a fixed temperature by Peltier cooling. The detector signals are digitised, and digital filters in a Xilinx Spartan FPGA reconstruct the pulse height and thus the energy of the radiation. No analogue pulse shaping therefore takes place. The talk focuses on how the filters work and how they are implemented in the FPGA, and on building energy spectra with a state machine working together with external SDRAM. An ATmega microprocessor serves as the interface between FPGA and PC.
        Speaker: Dr Markus Dambacher (FMF)
        Slides
      • 4
        FPGA-based Tomography Cameras
        The Helmholtz-Zentrum Geesthacht develops new experiment environments for tomography beamlines. For this purpose, a high-resolution camera has been built with an E2V CCD sensor with 2048 × 2048 pixels and a pixel size of 13.5 µm². The frame rate is beyond 1 fps, but with a very high resolution of 16 bit. Another camera with a CMOS chip (CV2000), built by KIT, will be integrated into our experiment setup. The aim of this close cooperation with KIT is an experiment setup for phase-contrast investigation of organic matter. Very important parts of this setup are two plates with a very fine grid, one of which has to be shifted in 10 nm steps within a few ms. This is very challenging for the whole setup, bearing in mind the temperature stability needed to hold the exact position. To correlate the data with the actual position, the FPGA, which is programmed to handle the CMOS sensor and the data transfer via a PCIe link, has to be adapted and extended for the piezo drive control and more.
        Speakers: Mr Jörg Burmester (HZG), Mr Jörn Plewka (HZG)
        Slides
      • 5
        FPGA-based algorithm for center of gravity calculation of clustered signals using RAM for history storing
        A.A. Ushakov, Ch. Schulz, Th. Wilpert, Helmholtz-Zentrum Berlin für Materialien und Energie

        The acquisition system for a neutron detector consisting of a 157Gd-CsI converter and a Micro-Strip Gas Chamber was developed earlier [1]. The prototype incorporates 4 ASICs able to process 128 strips of one detector coordinate, delivering analog signals and digitized timing of incoming events. The analog signals are then digitized on an additional ADC board, and the extracted signals are examined further in an FPGA. One neutron can create signals on 3-5 detector strips, so the main task is to identify such clusters and determine their center of gravity. The center of gravity calculation takes place in an FPGA, allowing real-time processing. The data for the calculation are arranged in 4 FIFOs with individual buffers, each assigned to an ASIC. A cluster is identified by the following rules: signals must have neighboring coordinates within a maximum distance of 5 strips and time stamps within a period of at most three clock cycles. Due to the nature of the token-ring buffer, the ASIC output is not well ordered in time; however, signals are unambiguously identifiable by their time stamp.

        The developed algorithm collects data from the input FIFO buffers and stores them in dual-port RAM with asymmetrical inputs and outputs. The input buffer is read sequentially, so the size of the input port corresponds to the 64 bits of incoming data from one strip. A row of RAM stores data for all 121 strips of the sensor, including real and empty data. The input FIFO in front of the RAM stores only essential data, so the second input port initializes the next memory row with empty cells during one write cycle. The RAM block has additional combinatorial logic that recognizes when the next row initialization is necessary. The row output is accessible at any time and holds the full data right up to the next row initialization. After a full row has been read out from RAM, it is combined in a register buffer. This helps to recognize out-of-date events, which can be unambiguously recognized as clusters by their neighboring coordinates and time stamps within some time limit. After cluster recognition the data are removed from the buffer and, at the same time, used for the center of gravity calculation. The available memory block resources allow us to keep a history of about 4 ASIC reading cycles.

        The arithmetic core involves DSP resources such as adders, accumulators and dividers, which consume significant resources of a Virtex-5 FPGA and limit the performance. Optimal Virtex-5 utilization is reached with 12 dividers, each of which completes an operation in 7 clock cycles at a 125 MHz base frequency. The final design performs the center of gravity calculation in pipelined mode in 3 stages of 4 dividers. In this way we recognize and calculate up to 4 clusters per 32.5 MHz clock cycle. The algorithm delivers a calculated position resolution of 0.08 mm, compared with a detector strip pitch of 0.635 mm. The memory-based approach occupies approximately 90% of the Block RAM and 65% of the DSP48E resources of a Virtex-5 xc5vfx70t chip.

        The plan is to develop a new, more intelligent division control block that dynamically distributes the available DSP resources for cluster processing, so that at any time some dividers are engaged and others are ready to perform a division. An alternative would be to replace the existing solution based on a Virtex-5 mini-module with a more powerful Kintex-7, whose resources are 5 times higher. In the final design we are aiming to handle a count rate as high as 5·10^7 n/s/module.

        [1] Alimov, S. S. et al., "Development of very high rate and resolution neutron detectors with novel readout and DAQ hard- and software in DETNI," Nuclear Science Symposium Conference Record, 2008 (NSS '08), IEEE, pp. 1887-1900, 19-25 Oct. 2008, doi: 10.1109/NSSMIC.2008.4774759
        Speaker: Dr Andriy Ushakov (Helmholtz-Zentrum Berlin)
        Slides
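
The cluster rules stated in the abstract of talk 5 can be illustrated with a small software sketch. This is not the authors' FPGA design: the hit layout `(strip, time stamp, amplitude)`, the function names `find_clusters` and `center_of_gravity`, the greedy grouping strategy, and the reading of "within 5 strips" as "a cluster spans at most 5 adjacent strips" are all assumptions made for illustration. Only the two limits and the strip pitch are taken from the abstract.

```python
from typing import List, Tuple

# Assumed hit layout: (strip index, time stamp in clock cycles, amplitude)
Hit = Tuple[int, int, float]

MAX_STRIPS = 5          # abstract: neighbouring coordinates within 5 strips
MAX_TIME_SPAN = 3       # abstract: time stamps within three clock cycles
STRIP_PITCH_MM = 0.635  # abstract: detector strip pitch


def find_clusters(hits: List[Hit]) -> List[List[Hit]]:
    """Group hits into clusters (greedy strategy, an assumption).

    Hits are visited in ascending strip order. A hit joins the first
    cluster whose last strip is directly adjacent, provided the cluster
    stays within MAX_STRIPS strips and MAX_TIME_SPAN clock cycles.
    """
    clusters: List[List[Hit]] = []
    for hit in sorted(hits):
        strip, time, _ = hit
        for cluster in clusters:
            times = [h[1] for h in cluster] + [time]
            if (strip - cluster[-1][0] == 1
                    and strip - cluster[0][0] < MAX_STRIPS
                    and max(times) - min(times) <= MAX_TIME_SPAN):
                cluster.append(hit)
                break
        else:
            # No existing cluster accepted the hit: start a new one.
            clusters.append([hit])
    return clusters


def center_of_gravity(cluster: List[Hit]) -> float:
    """Amplitude-weighted mean strip position, in units of strips.

    Assumes the amplitudes sum to a non-zero value.
    """
    total = sum(h[2] for h in cluster)
    return sum(h[0] * h[2] for h in cluster) / total
```

Multiplying the result of `center_of_gravity` by `STRIP_PITCH_MM` gives a position in millimetres. In hardware the division is the expensive step, which is consistent with the abstract's emphasis on the number of dividers.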
      • 6
        Development of an FPGA-based DAQ for digital SiPMs using USB 3.0
        The presentation will show the concept and developments for a read-out system for Philips digital SiPMs. It uses a Spartan 6 FPGA to connect to these photon-counting devices and to transfer their data. A USB 3.0 interface is used as a high-speed link of up to 5 Gbit/s to transfer the sensor data to a PC.
        Speaker: Mr Holger Noeldgen (Forschungszentrum Juelich)
        Slides
    • 3:15 PM
      Coffee Break Room 110 in Building 2.5

    • Talks MO2 Room 110 in Building 2.5

      • 7
        The ATCA based Compute Node and its application in the Belle II PXD-DAQ
        In this talk we present the Compute Node, an ATCA carrier board and AMC board design based on Virtex-4 FX60 and Virtex-5 FX70T FPGAs. The system is designed to perform data acquisition of 22 GB/s and data reduction by a factor <10 at the Belle II pixel detector, which is supposed to start operation in 08/2015. The firmware programming comprises buffer management with pointer lookup tables, DDR2 memory access using NPI (native port interface), optical link data transfer using GTX transceivers and Aurora 8B/10B, SERDES links and custom UDP and TCP/IP interfaces. A parallel region-of-interest (ROI) algorithm performs data reduction of the PXD data based upon charged track extrapolation from the high level trigger and silicon strip vertex detector, arriving with a large latency and out of order. In addition, PXD cluster charge analysis will be performed on the compute nodes for identification of slow pions from D* decays.
        Speaker: Bjoern Spruck (Uni Gießen)
        Slides
      • 8
        The Juelich Digital Readout System for $\overline{\text{P}}$anda development
        The $\overline{\text{P}}$anda detector is one of the main experiments at the upcoming Facility for Antiproton and Ion Research in Darmstadt (FAIR). The fixed-target experiment will explore $\overline{\text{p}}$p annihilation with intense, phase-space-cooled beams with momenta between 1.5 and 15 GeV/c. For the development of the Micro Vertex Detector (MVD), the innermost tracking detector of $\overline{\text{P}}$anda, the evaluation of prototypes and detector parts is very important. Different prototypes of the pixel front-end chip ToPix (Torino Pixel) need to be tested and characterized under similar conditions to improve the development. A suitable readout system is necessary to control these devices under test (DUT) and to save the recorded data. To provide similar conditions for different prototypes, a modular readout concept is required that can be adapted in a simple way to the specific interface of different types of electronics. To meet the requirements of an upcoming full-size ToPix prototype and of online analysis, an upgrade of the Juelich Digital Readout System was developed. The Xilinx ML605 evaluation board with its Virtex 6 FPGA is the main hardware component of the upgraded system, providing a 1 GBit/s optical connection and 2 Gb of DDR3 RAM. The DUT can be connected to the FPGA via a freely configurable 160-pin connector. An overview of the system components and measurements of the ToPix prototype with the new readout system will be shown.
        Speaker: Simone Esch (Forschungszentrum Jülich)
        Slides
      • 9
        Data Concentrator for the Belle II DEPFET Pixel Detector
        The innermost two layers of the Belle II detector, located at the KEK facility in Tsukuba, Japan, will be covered by high-granularity DEPFET pixel (PXD) sensors. This leads to a high data rate of around 60 Gbps, which has to be significantly reduced by the Data Acquisition System. For the data reduction, the hit information of the surrounding silicon strip detector (SVD) is used to define so-called Regions of Interest (ROI), and only the information of the pixels located inside these ROIs is saved. The ROIs for the pixel detector are computed by reconstructing track segments from SVD data and extrapolating them back to the PXD. A data reduction of up to a factor of 10 can be achieved this way. All the necessary processing stages (receiving and multiplexing the data on many optical links from the SVD, the track reconstruction and the definition of the ROIs) will be performed by the Data Concentrator. The planned hardware design is based on a distributed set of Advanced Mezzanine Cards (AMC), each equipped with a Field Programmable Gate Array (FPGA) chip and 4 optical transceivers. In this talk, the hardware and the firmware development of the algorithms to multiplex the incoming data streams on Xilinx RocketIO transceivers, and the necessary pre-processing steps in each FPGA, are discussed. In addition, a prototype implementation of the FPGA-based tracking algorithm will be presented with some preliminary simulation results.
        Speaker: Michael Schnell (University of Bonn)
        Slides
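
The ROI-based reduction described in the abstract of talk 9 (keep only pixels inside Regions of Interest) can be sketched in software as follows. This is an illustration, not the Data Concentrator firmware: the rectangular ROI shape, the `PixelHit` and `ROI` types, and the function names `select_hits` and `reduction_factor` are assumptions, and the track extrapolation that produces the ROIs is not modelled.

```python
from typing import Iterable, List, NamedTuple


class PixelHit(NamedTuple):
    row: int
    col: int
    charge: int


class ROI(NamedTuple):
    """Rectangular region of interest with inclusive bounds (an assumption)."""
    row_min: int
    row_max: int
    col_min: int
    col_max: int

    def contains(self, hit: PixelHit) -> bool:
        return (self.row_min <= hit.row <= self.row_max
                and self.col_min <= hit.col <= self.col_max)


def select_hits(hits: Iterable[PixelHit], rois: Iterable[ROI]) -> List[PixelHit]:
    """Keep only the hits that lie inside at least one ROI."""
    roi_list = list(rois)
    return [h for h in hits if any(r.contains(h) for r in roi_list)]


def reduction_factor(n_in: int, n_out: int) -> float:
    """Ratio of hits before and after selection (infinite if nothing is kept)."""
    return float("inf") if n_out == 0 else n_in / n_out
```

With four input hits of which two fall inside a single ROI, `reduction_factor(4, 2)` gives 2.0. The abstract quotes a reduction of up to a factor of 10 for the real system.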
      • 10
        Firmware development for the ATLAS IBL BOC card
        The upgrade of the ATLAS pixel detector at the Large Hadron Collider requires a redesign of the data readout. The Insertable B-Layer adds 448 front-end chips, for whose readout new FPGA readout cards consisting of a Back-of-Crate (BOC) card and a Read-Out Driver (ROD) card have been developed. The talk covers the firmware development and the first firmware tests for the BOC card. Both the control of the card and the data processing in the FPGAs are examined in more detail. Particular attention is paid to the 40 and 160 MBit/s data paths to and from the detector. For the 40 MBit/s data path to the detector, several approaches to fine delay of the signal in the 50 to 100 ps range are shown. For the 160 MBit/s path, the focus is on processing the 8b10b-encoded signals coming from the detector.
        Speaker: Mr Marius Wensing (Bergische Universität Wuppertal)
        Slides
    • 7:00 PM
      Dinner Große Rurstraße 94 (Restaurant Am Hexenturm)

  • Tuesday, December 4
    • Invited Talk Room 110 in Building 2.5

      • 11
        High Level Synthesis with Xilinx HLS
        Agenda: Functional Abstraction Level; High Level Synthesis (HLS); Control & Datapath Extraction; Scheduling & Binding; Arbitrary Precision Data Types; Top Level I/O Ports; Loops; Arrays; Interfaces; First Example; Latency & Throughput Optimizations
        Speaker: Mr Eugen Krassin (plc2)
        Slides
    • 10:30 AM
      Coffee Break Room 110 in Building 2.5

    • Talks TUE1 Room 110 in Building 2.5

      • 12
        FPGA Programming Methods - An Overview
        This talk will give an overview of the different programming methods available for FPGAs: Hardware Description Languages (HDL), Model-based Design (Matlab/Simulink), High-Level Synthesis (HLS) tools and IP cores.
        Speaker: Diana Goehringer (KIT)
        Slides
      • 14
        Parallelisation potential of image segmentation in hierarchical island structures on hardware-accelerated platforms in real-time applications
        The presented work addresses two types of compact HPC platforms found to be most successful today: FPGA-based expansion cards and graphics processing unit (GPU) coprocessing boards. The FPGA and GPU architectures are briefly discussed to identify the major aspects of application design for these platforms. The application in focus is a fast automated image segmentation method (GSC, Grey Value Structure Code). This complex method is suitable for different application areas and provides high-quality segmentation results. An analysis of the parallelisation potential of the method is carried out. Relying on many statistical measurements and on results from versatile system models, the GSC algorithm is reworked specifically for implementation on the two massively parallel computation platforms to achieve the high performance needed for real-time application set-ups. Special attention has been paid to the question of effective organisation of the computation on the target platforms. The two implementations are compared to highlight their relative merits and downsides for this complex and computation-intensive application. The results show that, despite a considerably longer development cycle, the FPGA-based solution on the Xilinx Virtex II Pro architecture can compete with the implementation on the specialised nVidia Tesla C1060 card. Compared to a single CPU (Opteron 2.6 GHz), the FPGA accelerates the application by a factor of about 23, while the GPU achieves factors of 13 to 20 depending on the image resolution.
        Speaker: Mr Sergey Suslov (Research Center Juelich, Central Institute for Electronics)
        Slides
    • 12:15 PM
      Lunch Room 110 in Building 2.5

    • Talks TUE2 Room 110 in Building 2.5

      • 15
        Field Programmable Gate Array Based Data Digitisation with Commercial Elements
        One of the most important aspects of particle identification experiments is the digitisation of time, amplitude and charge data from detectors. These conversions are mostly done with Application-Specific Integrated Circuits (ASICs). However, recent developments in Field Programmable Gate Array (FPGA) technology allow us to use commercial electronic components for the required Front-End Electronics (FEE) and to do the digitisation in the FPGA. It is possible to do Time-of-Flight (ToF), Time-over-Threshold (ToT), amplitude and charge measurements with converters implemented in the FPGA. We call this principle come & kiss: Use COmplex ComMErcial Elements & Keep It Small and Simple.
        Speaker: Mr Cahit Ugur (GSI)
        Slides
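
As a software illustration of the Time-over-Threshold (ToT) quantity mentioned in the abstract of talk 15, the sketch below measures how long a sampled pulse stays above a threshold. It is not the FPGA converter from the talk, which would typically time discriminator edges rather than process samples. The function name `time_over_threshold`, the sampled-waveform input and the linear interpolation between samples are assumptions.

```python
from typing import Optional, Sequence


def time_over_threshold(samples: Sequence[float],
                        threshold: float,
                        dt: float = 1.0) -> Optional[float]:
    """Duration of the first excursion above `threshold`.

    `dt` is the sampling interval. Edge times are linearly interpolated
    between neighbouring samples. Returns None if the pulse never
    completes a rising and a falling crossing.
    """
    def crossing_time(i: int) -> float:
        # Interpolated threshold crossing between samples i-1 and i.
        a, b = samples[i - 1], samples[i]
        return (i - 1 + (threshold - a) / (b - a)) * dt

    t_lead: Optional[float] = None
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if t_lead is None and prev < threshold <= cur:
            t_lead = crossing_time(i)             # leading edge
        elif t_lead is not None and prev >= threshold > cur:
            return crossing_time(i) - t_lead      # trailing edge
    return None
```

For the samples `[0, 0, 2, 4, 2, 0, 0]` with a threshold of 1 and `dt = 1`, the leading edge is interpolated at 1.5 and the trailing edge at 4.5, so the function returns 3.0. Because ToT grows with pulse size, it is commonly used as a proxy for amplitude or charge.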
      • 16
        An FPGA Platform for Ultra-fast Data Acquisition
        The next generation of physical experiments demands high-throughput data readout systems combined with embedded data processing. We will present the current status of a multi-purpose, high-performance data-acquisition system that has been developed at KIT. The design is based on three customizable IP cores used to build a multi-purpose, high-bandwidth DAQ system. The first IP core is a PCIe interface including a Bus Master DMA (BDM) architecture with a bandwidth of up to 32 Gb/s. The second IP core is a multi-port DDR3 memory interface that works at up to 51 Gb/s. The last IP core is a fast SerDes input stage with automatic parallel data pattern alignment logic. A 64-bit Linux driver seamlessly integrates the DAQ platform into any CPU/GPU server infrastructure. The multi-purpose readout architecture and its application to selected physical experiments are presented, and the performance will be discussed. The remaining bottleneck for current DAQ systems at these high data rates is the storage of the data. We propose solutions to overcome this limitation with emerging electronic components and ultra-fast serial links. We will discuss two approaches: FPGA peer-to-peer connections, based on PCIe Gen3 and fast local data storage with NAND flash solid-state storage, and an FPGA-to-network architecture. The second approach embeds the DAQ platform directly in high-performance networks that are well established in supercomputing (e.g. Infiniband).
        Speaker: Dr Michele Caselle (KIT)
        Slides
      • 17
        FPGA Oriented HW/FW Development in FEA DESY
        New FPGA Technologies (like XILINX Series 7 FPGAs) provide new and elegant approaches for high speed DAQ systems. High speed transceivers with data rates of more than 10Gbit/s and new scalable internal bus systems simplify these developments. Our group is working together with KIT on the development of a versatile DAQ card in the µTCA / AMC standard. This talk will give an overview about the current developments and an outlook to possible future activities.
        Speaker: Mr Frantisek Krivan (DESY)
        Slides
      • 18
        FPGA framework development for a MicroTCA system
        The MTCA.4 standard will be widely used in the machine control system of the XFEL. Our electronics development group FE at DESY has developed an FPGA-based readout module (DAMC2) interfacing dedicated μRTM modules for several applications in this field. To simplify the FPGA firmware development of different control applications, a framework has been developed which covers most of the peripheral interfaces of the board and allows the user to concentrate on the application-related interfaces and algorithms. This talk will give a short overview of current digital developments in our group and illustrate the framework with two user applications under development.
        Speaker: Mrs Qingqing Xia (DESY)
        Slides
      • 19
        Dynamic reconfiguration of an FPGA using an internal controller
        FPGA-based electronics developments are strongly shaped by the high flexibility of configuration, particularly during the development phase and later in use in the application. The demands on the design increase when FPGAs are also used, for example, under harsh environmental conditions. For satellite applications in particular, redundant concepts, radiation-hard assemblies and reconfiguration of the design must be taken into account. The talk presents one possible approach to dynamically renewing the FPGA configuration during operation. Using the space-qualified Xilinx Virtex-4 FPGA family as an example, it presents configuration via the ICAP interface as well as by an external controller via the SelectMAP interface. The basic procedure is demonstrated, and both approaches are considered with regard to speed and safety aspects.
        Speaker: Mr Markus Dick (Forschungszentrum Juelich)
        Slides