Publications

Filgueras, A; Vidal, M; Mateu, M; Jiménez-González, D; Álvarez, C; Martorell, X; Ayguadé, E; Theodoropoulos, D; Pnevmatikatos, D; Gai, P; Garzarella, S; Oro, D; Hernando, J; Bettin, N; Pomella, A; Procaccini, M; Giorgi, R

The AXIOM Project: IoT on Heterogeneous Embedded Platforms Journal Article

In: IEEE Design and Test, vol. pre-print, pp. 1-6, 2019, ISSN: 2168-2356.

Giorgi, Roberto; Khalili, Farnam; Procaccini, Marco

Translating Timing into an Architecture: the Synergy of COTSon and HLS (Domain Expertise: Designing a Computer Architecture via HLS) Journal Article

In: International Journal of Reconfigurable Computing, pp. 1–18, 2019, ISSN: 1687-7209.

@article{Giorgi19-ijrc,

title = {Translating Timing into an Architecture: the Synergy of COTSon and HLS (Domain Expertise: Designing a Computer Architecture via HLS)},

author = {Roberto Giorgi and Farnam Khalili and Marco Procaccini},

doi = {10.1155/2019/2624938},

issn = {1687-7209},

year  = {2019},

date = {2019-09-01},

journal = {International Journal of Reconfigurable Computing},

pages = {1--18},

address = {London, UK},

abstract = {Translating a system requirement into a low-level representation (e.g., register transfer level or RTL) is the typical goal of the design of FPGA based systems. However, the Design Space Exploration (DSE) needed to identify the final architecture may be time consuming, even when using High Level Synthesis (HLS) tools. 

 

In this paper, we illustrate our hybrid methodology, which uses a frontend to HLS so that the DSE is performed more rapidly by using a higher-level abstraction, but without losing accuracy, thanks to the HP-Labs COTSon simulation infrastructure in combination with our DSE tools (MYDSE tools). In particular, this proposed methodology proved useful to achieve appropriate design of a whole system in a shorter time than trying to design everything directly in HLS.

 

Our motivating problem was to deploy a novel execution model called Data-Flow Threads (DF-Threads) running on yet to be designed hardware. For that goal, directly using the HLS was too premature in the design cycle. Therefore, a key point of our methodology consists in defining the first prototype in our simulation framework and gradually migrating the design into the Xilinx HLS after validating the key performance metrics of our novel system in the simulator. 

To explain this workflow, we first use a simple driving example consisting in the modelling of a two-way associative cache. Then, we explain how we generalized this methodology and describe the types of results that we were able to analyze in the AXIOM project, that helped us reduce the development time from months/weeks to days/hours.},

keywords = {},

pubstate = {published},

tppubtype = {article}

}

Giorgi, Roberto; Procaccini, Marco

Bridging a Data-Flow Execution Model to a Simple Programming Model Proceedings Article

In: IEEE Proc. of the International Conference on High Performance Computing and Simulation (HPCS), pp. 165-168, Dublin, Ireland, 2019, ISBN: 978-1-7281-4484-9.

Giorgi, Roberto; Bettin, Nicola; Ermini, Sara; Montefoschi, Francesco; Rizzo, Antonio

An Iris+Voice Recognition System for a Smart Doorbell Proceedings Article

In: IEEE 8th Mediterranean Conference on Embedded Computing (MECO), pp. 419-422, 2019, ISSN: 2377-5475.

Giorgi, Roberto; Oro, David; Ermini, Sara; Montefoschi, Francesco; Rizzo, Antonio

Embedded Face Analysis for Smart Videosurveillance Proceedings Article

In: IEEE 8th Mediterranean Conference on Embedded Computing (MECO), pp. 403-407, 2019, ISSN: 2377-5475.

Giorgi, Roberto; Khalili, Farnam; Procaccini, Marco

AXIOM: A Scalable, Efficient and Reconfigurable Embedded Platform Proceedings Article

In: IEEE Proceedings of Design, Automation and Test in Europe (DATE), pp. 1–6, Florence, Italy, 2019, ISBN: 978-3-9819263-3-0.

Giorgi, Roberto; Khalili, Farnam; Procaccini, Marco

Analyzing the Impact of Operating System Activity of different Linux Distributions in a Distributed Environment Proceedings Article

In: IEEE Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, pp. 422-429, Pavia, Italy, 2019, ISBN: 978-1-7281-1644-0.

Giorgi, Roberto; Khalili, Farnam; Procaccini, Marco

A Design Space Exploration Tool Set for Future 1K-core High-Performance Computers Proceedings Article

In: ACM Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools (RAPIDO), pp. 1–6, Valencia, Spain, 2019, ISBN: 978-1-4503-6260-3.

Giorgi, Roberto; Khalili, Farnam; Procaccini, Marco

Energy Efficiency Exploration on the ZYNQ Ultrascale+ Proceedings Article

In: IEEE Proceedings of the 30th International Conference on Microelectronics (ICM), pp. 52-55, Sousse, Tunisia, 2018, ISBN: 978-1-5386-8166-4.

Rizzo, Antonio; Caporali, Maurizio; Montefoschi, Francesco; Ermini, Sara; Oro, David; Hupont, Isabelle; Bettin, Nicola

Prototyping Edge Computing Services for IoT Unpublished Forthcoming

Forthcoming.

Xu, Ying Hao; Vidal, Miquel; Arejita, Beñat; Diaz, Javier; Alvarez, Carlos; Jiménez-González, Daniel; Martorell, Xavier; Mantovani, Filippo

Implementation of the K-Means Algorithm on Heterogeneous Devices: A Use Case Based on an Industrial Dataset Proceedings Article

In: Advances in Parallel Computing, pp. 642-651, IOS Press, 2018, ISSN: 0927-5452.

Amourgianos-Lorentzos, Vasileios

Efficient network interface design for low cost distributed systems Masters Thesis

2017.

Rizzo, Antonio; Montefoschi, Francesco; Caporali, Maurizio; Gisondi, Antonio; Burresi, Giovanni; Giorgi, Roberto

Rapid prototyping IoT solutions based on Machine Learning Proceedings Article

In: Conference: the European Conference, 2017.

Giorgi, Roberto

AXIOM: A 64-bit reconfigurable hardware/software platform for scalable embedded computing Proceedings Article

In: Embedded Computing (MECO), 2017 6th Mediterranean Conference on, IEEE, 2017, ISBN: 978-1-5090-6742-8.

@inproceedings{Giorgi2017,

title = {AXIOM: A 64-bit reconfigurable hardware/software platform for scalable embedded computing},

author = {Roberto Giorgi},

url = {http://ieeexplore.ieee.org/abstract/document/7977173/},

doi = {10.1109/MECO.2017.7977173},

isbn = {978-1-5090-6742-8},

year  = {2017},

date = {2017-07-13},

booktitle = {Embedded Computing (MECO), 2017 6th Mediterranean Conference on},

publisher = {IEEE},

abstract = {The AXIOM platform is built with the possibility in mind of executing an application not only on a single board but also, in a distributed fashion, on multiple boards. While this is a classic problem with some solutions in the case of no constraints, it becomes interesting for embedded computing and cyber-physical systems where we aim to accelerate applications while maintaining energy efficiency and also easy programmability. Currently, the AXIOM platform consists of a custom board based on the Xilinx Zynq Ultrascale+ ZU9EG which incorporates the largest FPGA available on that System-on-Chip at the moment, four 64-bit ARM cores and two 32-bit ARM cores, up to 32GiB of main memory and several 12.5Gbit/s transceivers. We relied on this hardware to develop our novel concept, which exploits dataflow execution in multiple ways for programs that are written in an OpenMP extension, known as OmpSs. A key aspect relates to the adopted memory consistency model, which allows the programmer to focus on aspects other than taking care of the communication among nodes. The lower level of our communication stack relies on a fast interconnect based on inexpensive USB-C type connectors rather than on other proprietary interfaces. The reconfigurable logic provides a complete Network Interface Card (NIC) to allow fast routing of the data and code of the system. We envision many applications for this platform although we are currently focused on developing two basic scenarios based on the Smart-Home and on Smart-Video surveillance. Our initial results confirm good scalability of the platform and a speed-up compared to other programming models such as Cilk and OpenMPI.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Pons, Jaume Bosch

Asynchronous runtime for task-based dataflow programming models Masters Thesis

2017.

@mastersthesis{Pons2017,

title = {Asynchronous runtime for task-based dataflow programming models},

author = {Jaume Bosch Pons},

year = {2017},

date = {2017-07-01},

abstract = {The importance of parallel programming is increasing year after year since the power wall popularized multi-core processors, and with them, shared memory parallel programming models. In particular, task-based programming models, like the standard OpenMP 4.0, have become more and more important. They allow describing a set of data dependences per task that the runtime uses to order the execution of tasks. This order is calculated using shared graphs, which are updated by all threads but in exclusive access using synchronization mechanisms (locks) to ensure the dependences correctness. Although exclusive accesses are necessary to avoid data race conditions, those may imply contention that limits the application parallelism. This becomes critical in many-core systems because several threads may be wasting computation resources waiting to access the runtime structures. This master thesis introduces the concept of an asynchronous runtime management suitable for task-based programming model runtimes. The runtime proposal is based on the asynchronous management of the runtime structures like task dependence graphs. Therefore, the application threads request actions to the runtime instead of directly executing the needed modifications. The requests are then handled by a runtime manager which can be implemented in different ways. This master thesis presents an extension to a previously implemented centralized runtime manager and presents a novel implementation of a distributed runtime manager. On one hand, the runtime design based on a centralized manager [1] is extended to dynamically adapt the runtime behavior according to the manager load with the objective of being as fast as possible. On the other hand, a novel runtime design based on a distributed manager implementation is proposed to overcome the limitations observed in the centralized design. 
The distributed runtime implementation allows any thread to become a runtime manager thread if it helps to exploit the application parallelism. That is achieved using a new runtime feature, also implemented in this master thesis, for runtime functionality dispatching through a callback system. The proposals are evaluated in different many-core architectures and their performance is compared against the baseline runtimes used to implement the asynchronous versions. Results show that the centralized manager extension can overcome the hard limitations of the initial basic implementation, that the distributed manager fixes the observed problems in previous implementation, and the proposed asynchronous organization significantly outperforms the speedup obtained by the original runtime for real benchmarks.},

keywords = {},

pubstate = {published},

tppubtype = {mastersthesis}

}

Theodoropoulos, Dimitris; Mazumdar, Somnath; Ayguade, Eduard; Bettin, Nicola; Bueno, Javier; Ermini, Sara; Filgueras, Antonio; Jiménez-González, Daniel; Martínez, Carlos Álvarez; Martorell, Xavier; Montefoschi, Francesco; Oro, David; Pnevmatikatos, Dionisis; Rizzo, Antonio; Gai, Paolo; Garzarella, Stefano; Morelli, Bruno; Pomella, Alberto; Giorgi, Roberto

The AXIOM Platform for Next-generation Cyber Physical Systems Book Section

In: Microprocessors and Microsystems, Elsevier B.V., 2017.

Wagner, Michael; Llort, Germán; Filgueras, Antonio; Jiménez-González, Daniel; Servat, Harald; Teruel, Xavier; Mercadal, Estanislao; Álvarez, Carlos; Giménez, Judit; Martorell, Xavier; Ayguadé, Eduard; Labarta, Jesús

Monitoring Heterogeneous Applications with the OpenMP Tools Interface Proceedings Article

In: Tools for High Performance Computing 2016, pp. 41-57, Springer, 2017.

@inproceedings{Wagner2017,

title = {Monitoring Heterogeneous Applications with the OpenMP Tools Interface},

author = {Michael Wagner and Germ\'{a}n Llort and Antonio Filgueras and Daniel Jim\'{e}nez-Gonz\'{a}lez and Harald Servat and Xavier Teruel and Estanislao Mercadal and Carlos \'{A}lvarez and Judit Gim\'{e}nez and Xavier Martorell and Eduard Ayguad\'{e} and Jes\'{u}s Labarta},

editor = {Springer},

url = {https://link.springer.com/chapter/10.1007/978-3-319-56702-0_3},

doi = {10.1007/978-3-319-56702-0_3},

year  = {2017},

date = {2017-05-09},

booktitle = {Tools for High Performance Computing 2016},

journal = {Tools for High Performance Computing 2016},

pages = {41-57},

publisher = {Springer},

abstract = {Heterogeneous systems are gaining more importance in supercomputing, yet they are challenging to program and developers require support tools to understand how well their accelerated codes perform and how they can be improved. The OpenMP Tools Interface (OMPT) is a new performance monitoring interface that is being considered for integration into the OpenMP standard. OMPT allows monitoring the execution of heterogeneous OpenMP applications by revealing the activity of the runtime through a standardized API as well as facilitating the exchange of performance information between devices with accelerated codes, and the analysis tool. In this paper we describe our efforts implementing parts of the OMPT specification necessary to monitor accelerators. In particular, the integration of the OMPT features to our parallel runtime system and instrumentation framework helps to obtain detailed performance information about the execution of the accelerated tasks issued to the devices to allow an insightful analysis. As a result of this analysis, the parallel runtime of the programming model has been improved. We focus on the evaluation of monitoring FPGA devices studying the performance of a common kernel in scientific algorithms: matrix multiplication. Nonetheless, this development is as well applicable to monitor GPU accelerators and Intel® Xeon Phi™ co-processors operating under the OmpSs programming model.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Rizzo, Antonio; Burresi, Giovanni; Montefoschi, Francesco; Caporali, Maurizio; Giorgi, Roberto

Making IoT with UDOO Journal Article

In: Interaction Design and Architecture(s), vol. 1, no. 30, pp. 95-112, 2016, ISSN: 1826-9745.

Giorgi, Roberto; Mazumdar, Somnath; Viola, Stefano; Gai, Paolo; Garzarella, Stefano; Morelli, Bruno; Pnevmatikatos, Dionisios; Theodoropoulos, Dimitris; Alvarez, Carlos; Ayguade, Eduard; Bueno, Javier; Filgueras, Antonio; Jimenez-Gonzalez, Daniel; Martorell, Xavier

Modeling Multi-Board Communication in the AXIOM Cyber-Physical System Journal Article

In: Ada User Journal, vol. 37, no. 4, pp. 228-235, 2016, ISSN: 1381-6551.

Giorgi, Roberto

Exploring Future Many-Core Architectures: The TERAFLUX Evaluation Framework Book Chapter

In: Advances in Computers (ADV COMPUT), Elsevier, 2016, ISSN: 0065-2458.

Giorgi, Roberto; Bettin, Nicola; Gai, Paolo; Martorell, Xavier; Rizzo, Antonio

AXIOM: A Flexible Platform for the Smart Home Book Chapter

In: Keramidas, Georgios; Voros, Nikolaos; Hübner, Michael (Ed.): Components and Services for IoT Platforms: Paving the Way for IoT Standards, pp. 57-74, Springer International Publishing, Cham, 2016, ISBN: 978-3-319-42304-3.

@inbook{Giorgi2016b,

title = {AXIOM: A Flexible Platform for the Smart Home},

author = {Giorgi, Roberto and Bettin, Nicola and Gai, Paolo and Martorell, Xavier and Rizzo, Antonio},

editor = {Keramidas, Georgios and Voros, Nikolaos and H{\"u}bner, Michael},

url = {http://dx.doi.org/10.1007/978-3-319-42304-3_3},

doi = {10.1007/978-3-319-42304-3_3},

isbn = {978-3-319-42304-3},

year  = {2016},

date = {2016-09-24},

booktitle = {Components and Services for IoT Platforms: Paving the Way for IoT Standards},

pages = {57-74},

publisher = {Springer International Publishing},

address = {Cham},

abstract = {The AXIOM hardware/software platform aims at bringing easy programmability on top of a cluster of processors by using a fast interconnect and FPGA as a basis for building a scalable embedded system. The Smart Home is one of the key scenarios in which AXIOM could be useful for the Internet-of-Things (IoT). In Smart Homes, everything is linked to the flow of information that needs to travel from the on-the-field devices to the cloud servers. The information sensed in the environment will not be transmitted as is to the higher layers, but is somehow interpreted to provide a synthetic light-weight representation of the environment. In such a scenario, it is then clear that there is a need for peripheral nodes as well as intermediate gateways which need to be able to perform high-performance computational loads. AXIOM provides the possibility of designing a cluster of low-power/low-budget boards, which could be packed inside a high-performance embedded low-cost product. The AXIOM boards are heterogeneous, thus allowing for even greater diversity which is needed in those kinds of IoT scenarios. The cluster itself can then be integrated inside the IoT architectures as a computational-power node, which could be the center of a distributed intelligence near the edges of the IoT network.},

howpublished = {Springer International Publishing},

keywords = {},

pubstate = {published},

tppubtype = {inbook}

}

Llort, Germán; Filgueras, Antonio; Jiménez-González, Daniel; Servat, Harald; Teruel, Xavier; Mercadal, Estanislao; Álvarez, Carlos; Giménez, Judit; Martorell, Xavier; Ayguadé, Eduard; Labarta, Jesús

The Secrets of the Accelerators Unveiled: Tracing Heterogeneous Executions Through OMPT Proceedings

Springer International Publishing, vol. OpenMP: Memory, Devices and Tasks, 2016.

Abstract | Links | BibTeX

@proceedings{Llort2016,

title = {The Secrets of the Accelerators Unveiled: Tracing Heterogeneous Executions Through OMPT},

author = {Germán Llort and Antonio Filgueras and Daniel Jiménez-González and Harald Servat and Xavier Teruel and Estanislao Mercadal and Carlos Álvarez and Judit Giménez and Xavier Martorell and Eduard Ayguadé and Jesús Labarta},

url = {https://link.springer.com/chapter/10.1007/978-3-319-45550-1_16},

doi = {10.1007/978-3-319-45550-1_16},

year  = {2016},

date = {2016-09-21},

volume = {OpenMP: Memory, Devices and Tasks},

publisher = {Springer International Publishing},

abstract = {Heterogeneous systems are an important trend in the future of supercomputers, yet they can be hard to program and developers still lack powerful tools to gain understanding about how well their accelerated codes perform and how to improve them.



Having different types of hardware accelerators available, each with their own specific low-level APIs to program them, there is not yet a clear consensus on a standard way to retrieve information about the accelerator’s performance. To improve this scenario, OMPT is a novel performance monitoring interface that is being considered for integration into the OpenMP standard. OMPT allows analysis tools to monitor the execution of parallel OpenMP applications by providing detailed information about the activity of the runtime through a standard API. For accelerated devices, OMPT also facilitates the exchange of performance information between the runtime and the analysis tool. We implement part of the OMPT specification that refers to the use of accelerators both in the Nanos++ parallel runtime system and the Extrae tracing framework, obtaining detailed performance information about the execution of the tasks issued to the accelerated devices to later conduct insightful analysis.



Our work extends previous efforts in the field to expose detailed information from the OpenMP and OmpSs runtimes, regarding the activity and performance of task-based parallel applications. In this paper, we focus on the evaluation of FPGA devices studying the performance of two common kernels in scientific algorithms: matrix multiplication and Cholesky decomposition. Furthermore, this development is seamlessly applicable for the analysis of GPGPU accelerators and Intel® Xeon Phi™ co-processors operating under the OmpSs programming model.},

keywords = {},

pubstate = {published},

tppubtype = {proceedings}

}

Mazumdar, Somnath; Ayguade, Eduard; Bettin, Nicola; Bueno, Javier; Ermini, Sara; Filgueras, Antonio; Jimenez-Gonzalez, Daniel; Martinez, Alvarez; Martorell, Xavier; Montefoschi, Francesco; Oro, David; Pnevmatikatos, Dionisis; Rizzo, Antonio; Theodoropoulos, Dimitris; Giorgi, Roberto

AXIOM: A Hardware-Software Platform for Cyber Physical Systems Journal Article

In: pp. 539–546, 2016, ISBN: 978-1-5090-2817-7.

Links | BibTeX

Theodoropoulos, Dimitris; Pnevmatikatos, Dionisis; Garzarella, Stefano; Gai, Paolo; Rizzo, Antonio; Giorgi, Roberto

AXIOM: enabling parallel processing in cyber-physical systems. Proceedings Article

In: International Conference on Field-Programmable Logic and Applications, 2016.

Abstract | Links | BibTeX

Alvarez, Carlos; Ayguade, Eduard; Bosch, Jaume; Bueno, Javier; Cherkashin, Artem; Filgueras, Antonio; Jimenez-Gonzalez, Daniel; Martorell, Xavier; Navarro, Nacho; Vidal, Miquel; Theodoropoulos, Dimitris; Pnevmatikatos, Dionisios N.; Catani, Davide; Oro, David; Fernandez, Carles; Segura, Carlos; Rodriguez, Javier; Hernando, Javier; Scordino, Claudio; Gai, Paolo; Passera, Pierluigi; Pomella, Alberto; Bettin, Nicola; Rizzo, Antonio; Giorgi, Roberto

The AXIOM Software Layers Journal Article

In: "ELSEVIER Microprocessors and Microsystems", 2016, ISSN: 0141-9331.

Abstract | Links | BibTeX

@article{Alvarez2016,

title = {The AXIOM Software Layers},

author = {Carlos Alvarez and Eduard Ayguade and Jaume Bosch and Javier Bueno and Artem Cherkashin and Antonio Filgueras and Daniel Jimenez-Gonzalez and Xavier Martorell and Nacho Navarro and Miquel Vidal and Dimitris Theodoropoulos and Dionisios N. Pnevmatikatos and Davide Catani and David Oro and Carles Fernandez and Carlos Segura and Javier Rodriguez and Javier Hernando and Claudio Scordino and Paolo Gai and Pierluigi Passera and Alberto Pomella and Nicola Bettin and Antonio Rizzo and Roberto Giorgi},

url = {http://www.sciencedirect.com/science/article/pii/S0141933116300850},

doi = {10.1016/j.micpro.2016.07.002},

issn = {0141-9331},

year  = {2016},

date = {2016-07-09},

journal = {"ELSEVIER Microprocessors and Microsystems"},

abstract = {People and objects will soon share the same digital network for information exchange in a world named as the age of the cyber-physical systems. The general expectation is that people and systems will interact in real-time. This puts pressure on systems design to support increasing demands on computational power, while keeping a low power envelope. Additionally, modular scaling and easy programmability are also important to ensure these systems become widespread. The whole set of expectations imposes scientific and technological challenges that need to be properly addressed. The AXIOM project (Agile, eXtensible, fast I/O Module) will research new hardware/software architectures for cyber-physical systems to meet such expectations. The technical approach aims at solving fundamental problems to enable easy programmability of heterogeneous multi-core multi-board systems. AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware. Modular scalability will be possible thanks to a fast interconnect embedded into each module. To this aim, an innovative ARM and FPGA-based board will be designed.},

keywords = {},

pubstate = {published},

tppubtype = {article}

}

Giorgi, Roberto

Exploring Dataflow-based Thread Level Parallelism in Cyber-physical Systems Proceedings Article

In: pp. 295-300, ACM, New York, NY, USA, 2016, ISBN: 978-1-4503-4128-8.

Abstract | Links | BibTeX

Scordino, Claudio; Morelli, Bruno

Sharing memory in modern distributed applications Proceedings

2016, ISBN: 978-1-4503-3739-7.

Abstract | Links | BibTeX

Verdoscia, Lorenzo; Giorgi, Roberto

A Data-Flow Soft-Core Processor for Accelerating Scientific Calculation on FPGAs Journal Article

In: Mathematical Problems in Engineering, vol. 2016, no. 1, pp. 1-21, 2016, ISSN: 1563-5147.

Abstract | Links | BibTeX

Burgio, Paolo; Alvarez, Carlos; Ayguadé, Eduard; Filgueras, Antonio; Jiménez-González, Daniel; Martorell, Xavier; Navarro, Nacho; Giorgi, Roberto

Simulating next-generation Cyber-physical computing platforms Journal Article

In: Ada User Journal, vol. 37, no. 1, pp. 59-63, 2016, ISSN: 1381-6551, (TO APPEAR).

Abstract | Links | BibTeX

Mazumdar, Somnath; Giorgi, Roberto

A Survey on Hardware and Software Support for Thread Level Parallelism Journal Article

In: 2016.

Abstract | Links | BibTeX

@article{Mazumdar2016b,

title = {A Survey on Hardware and Software Support for Thread Level Parallelism},

author = {Somnath Mazumdar and Roberto Giorgi},

url = {https://arxiv.org/abs/1603.09274},

year = {2016},

date = {2016-03-01},

abstract = {To support growing massive parallelism, functional components and also the capabilities of current processors are changing and continue to do so. Today's computers are built upon multiple processing cores and run applications consisting of a large number of threads, making runtime thread management a complex process. Further, each core can support multiple, concurrent thread execution. Hence, hardware and software support for threads is more and more needed to improve peak-performance capacity and overall system throughput, and has therefore been the subject of much research. This paper surveys many of the proposed or currently available solutions for executing, distributing and managing threads both in hardware and software. The nature of current applications is diverse. To increase the system performance, not all programming models may be suitable to harness the built-in massive parallelism of multicore processors. Due to the heterogeneity in hardware, the hybrid programming model (which combines the features of the shared and distributed models) has currently become very promising. In this paper, we first give an overview of threads, threading mechanisms and their management issues during execution. Next, we discuss different parallel programming models considering their explicit thread support. We also review the programming models with respect to their support for shared-memory, distributed-memory and heterogeneity. Hardware support at execution time is crucial to the performance of the system, thus different types of hardware support for threads also exist or have been proposed, primarily based on widely used programming models. We further discuss software support for threads, mainly to increase the deterministic behavior during runtime. Finally, we conclude the paper by discussing some common issues related to thread management.},

keywords = {},

pubstate = {published},

tppubtype = {article}

}

2015

Giorgi, R.; Scionti, A.

A scalable thread scheduling co-processor based on data-flow principles Journal Article

In: vol. 53, pp. 100–108, 2015, ISSN: 0167-739X.

Abstract | Links | BibTeX

Giorgi, Roberto

Scalable Embedded Systems: Towards the Convergence of High-Performance and Embedded Computing Proceedings Article

In: Proceedings of the 13th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2015), 2015.

Abstract | Links | BibTeX

@inproceedings{Giorgi15d,

title = {Scalable Embedded Systems: Towards the Convergence of High-Performance and Embedded Computing},

author = {Roberto Giorgi},

url = {http://www.axiom-project.eu/wp-content/uploads/2016/03/EUC15.pdf},

year  = {2015},

date = {2015-10-20},

booktitle = {Proceedings of the 13th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2015)},

abstract = {Embedded System toolchains are highly customized for a specific System-on-Chip (SoC). When the application needs more performance, the designer is typically forced to adopt a new SoC and possibly another toolchain. The rationale for not scaling performance by using, e.g., two SoCs, is that maintaining most of the operations on-chip may allow for higher energy efficiency. We are exploring the feasibility and trade-offs of designing and manufacturing a new Single Board Computer (SBC) that could serve flexibly for a number of current and future applications, by allowing scalability through clusters of SBCs while keeping the same programming model for the SBC. This board is based on FPGAs and embedded processors, and its key points are: i) a fast custom interconnect for board-to-board communication and ii) an easily programmable environment which would allow both the off-loading of code into accelerators (either soft-IP blocks or hard-IP blocks) and, at the same time, the distribution of computation across boards. A key challenge to successfully deploying this paradigm is to properly distribute the threads across several boards without the explicit intervention of the programmer. In this paper we describe how to dynamically and efficiently distribute the computational threads in symbiosis with an appropriate memory model to allow the system scalability, so that we can double the performance by simply connecting two boards without i) changing the basic hardware components (e.g., to a different System-On-Chip) and ii) changing the programming model to follow the vendor specific toolchain. Our approach is to reduce data movement across boards. Our initial experiments have confirmed the feasibility of our approach.},

keywords = {},

pubstate = {published},

tppubtype = {inproceedings}

}

Jimenez-Gonzalez, Daniel; Alvarez-Martinez, Carlos; Filgueras, Antonio; Martorell, Xavier; Langer, Jan; Noguera, Juanjo; Vissers, Kees

Coarse-Grain Performance Estimator for Heterogeneous Parallel Computing Architectures like Zynq All-Programmable SoC Journal Article

In: Second International Workshop on FPGAs for Software Programmers FSP 2015, vol. CoRR, 2015.

Abstract | Links | BibTeX

Alvarez, Carlos; Ayguade, Eduard; Bueno, Javier; Filgueras, Antonio; Jimenez-Gonzalez, Daniel; Martorell, Xavier; Navarro, Nacho; Theodoropoulos, Dimitris; Pnevmatikatos, Dionisios; Catani, Davide; Scordino, Claudio; Gai, Paolo; Segura, Carlos; Fernandez, Carles; Oro, David; Rodriguez-Saeta, Javier; Passera, Pierluigi; Pomella, Alberto; Rizzo, Antonio; Giorgi, Roberto

The AXIOM Software Layers Journal Article

In: DSD 2015, 18th Euromicro Conference on Digital Systems Design (DSD), 2015.

Links | BibTeX

Mondelli, Andrea; Ho, Nam; Scionti, Alberto; Solinas, Marco; Portero, Antoni; Giorgi, Roberto

Dataflow Support in x86_64 Multicore Architectures through Small Hardware Extensions Conference

2015.

Abstract | Links | BibTeX

Theodoropoulos, Dimitris; Pnevmatikatos, Dionisis; Alvarez, Carlos; Ayguade, Eduard; Bueno, Javier; Filgueras, Antonio; Jimenez-Gonzalez, Daniel; Martorell, Xavier; Navarro, Nacho; Segura, Carlos; Fernandez, Carles; Oro, David; Saeta, Javier Rodriguez; Gai, Paolo; Rizzo, Antonio; Giorgi, Roberto

The AXIOM project (Agile, eXtensible, fast I/O Module) Journal Article

In: International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation - SAMOS XV 2015, 2015.

Abstract | Links | BibTeX

@article{Theodoropoulos2015,

title = {The AXIOM project (Agile, eXtensible, fast I/O Module)},

author = {Dimitris Theodoropoulos and Dionisis Pnevmatikatos and Carlos Alvarez and Eduard Ayguade and Javier Bueno and Antonio Filgueras and Daniel Jimenez-Gonzalez and Xavier Martorell and Nacho Navarro and Carlos Segura and Carles Fernandez and David Oro and Javier Rodriguez Saeta and Paolo Gai and Antonio Rizzo and Roberto Giorgi},

url = {http://samos-conference.com/Resources_Samos_Websites/Proceedings_Repository_SAMOS/2015/Files/SS0_03.pdf},

year  = {2015},

date = {2015-07-21},

journal = {International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation - SAMOS XV 2015},

abstract = {The AXIOM project (Agile, eXtensible, fast I/O Module) aims at researching new software/hardware architectures for the future Cyber-Physical Systems (CPSs). These systems are expected to react in real-time, provide enough computational power for the assigned tasks, consume the least possible energy for such task (energy efficiency), scale up through modularity, allow for an easy programmability across performance scaling, and exploit at best existing standards at minimal costs. Current solutions for providing enough computational power are mainly based on multi- or many-core architectures. For example, some current research projects (such as ADEPT or PSOCRATES) are already investigating how to join efforts from the High-Performance Computing (HPC) and the Embedded Computing domains, which are both focused on high power efficiency, while GPUs and new Dataflow platforms such as Maxeler, or in general FPGAs, are claimed as the most energy efficient. We present the project’s initial approach, ideas and key concepts, and describe the AXIOM preliminary architecture. Our starting point uses power efficient multi-core nodes, such as ARM cores and FPGA accelerators on the same die, as in the Xilinx Zynq. We will work to provide an integrated environment that supports programmability of the parallel, interconnected nodes that form a CPS system, and evaluate our ideas using demanding test application scenarios.},

keywords = {},

pubstate = {published},

tppubtype = {article}

}

Burresi, Giovanni; Giorgi, Roberto

A Field Experience for a Vehicle Recognition System using Magnetic Sensors Proceedings Article

In: IEEE MECO 2015, pp. 178-181, 2015, ISBN: 978-1-4799-8999-7.

Abstract | Links | BibTeX

Verdoscia, Lorenzo; Vaccaro, Roberto; Giorgi, Roberto

A matrix multiplier case study for an evaluation of a configurable Dataflow-Machine Proceedings Article

In: ACM CF'15 - LP-EMS, pp. 1-6, 2015, ISBN: 978-1-4503-3358-0.

Abstract | Links | BibTeX

Mondelli, Andrea; Ho, Nam; Scionti, Alberto; Solinas, Marco; Portero, Antoni; Giorgi, Roberto

Enhancing an x86_64 Multi-Core Architecture with Data-Flow Execution Support Proceedings Article

In: Article, ACM 2015 (Ed.): 2015, ISBN: 978-1-4503-3358-0.

Abstract | Links | BibTeX

Giorgi, Roberto

Transactional Memory on a Dataflow Architecture for Accelerating Haskell Journal Article

In: WSEAS Transactions on Computers, vol. 14, pp. 546-558, 2015, ISSN: 1109-2750.

Abstract | Links | BibTeX

Giorgi, Roberto

Accelerating Haskell on a Dataflow Architecture: a case study including Transactional Memory Proceedings Article

In: Proc. Int.l Conf. on Computer Engineering and Applications (CEA), pp. 91–100, Dubai, UAE, 2015, ISBN: 978-1-61804-276-7.

Abstract | Links | BibTeX
