
Azequia

A Thread-Based Implementation of the Message Passing Interface (MPI-1) Standard



AzequiaMPI is a thread-based, full implementation of the Message Passing Interface (MPI-1) standard. It is being built on the knowledge gained from studying the code and design of MPICH2, Open MPI, and other implementations such as TOMPI, TMPI, and MPC-MPI. Its most important characteristic is that it builds every MPI node as a thread rather than as an operating system process.

AzequiaMPI has three main layers:

  • The kernel, called IDSP or Azequia, provides point-to-point communication and RPC-based group and thread management. It comes in two versions: a BLK kernel, whose concurrency model is based on POSIX 1003.1c thread primitives, and an LFQ kernel, which synchronizes through non-blocking, lock-free structures. Applications can use the kernel directly, bypassing the MPI interface.
  • INET, the network interface, implements the network facilities. It is under development, based on the study of and experience with the Open MPI BTLs. Our goal is to provide TCP/IP and InfiniBand facilities.
  • The MPI interface, built upon the two layers above, provides MPI semantics to applications.

AzequiaMPI layers design
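To illustrate what "non-blocking lock-free structures" can look like in a kernel of this kind, here is a minimal sketch of a single-producer/single-consumer ring built on C11 atomics. It is not taken from the Azequia sources: the type `spsc_ring`, the functions `ring_init`, `ring_push` and `ring_pop`, and the capacity `RING_SLOTS` are all hypothetical names chosen for this example, and the real LFQ kernel may be organized quite differently.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 8  /* hypothetical capacity; must be a power of two */

/* Single-producer/single-consumer ring: exactly one thread pushes and
 * exactly one thread pops. Illustrative type, not Azequia's. */
typedef struct {
    void         *slot[RING_SLOTS];
    atomic_size_t head;  /* next slot to read; written only by the consumer */
    atomic_size_t tail;  /* next slot to write; written only by the producer */
} spsc_ring;

void ring_init(spsc_ring *r) {
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side. Never blocks: returns false when the ring is full. */
bool ring_push(spsc_ring *r, void *msg) {
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_SLOTS)
        return false;
    r->slot[tail % RING_SLOTS] = msg;
    /* Release: the slot write becomes visible before the new tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side. Never blocks: returns false when the ring is empty. */
bool ring_pop(spsc_ring *r, void **msg) {
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail)
        return false;
    *msg = r->slot[head % RING_SLOTS];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

Because neither operation takes a lock, a caller that finds the ring full or empty has to decide for itself whether to retry, spin, or do other work. That is the main practical difference from a mutex-and-condition-variable design in the style of the BLK kernel.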

AzequiaMPI uses other tools:

  • Process Manager Interface (from MPICH2) through MPD (Multi-Purpose Daemon). It is used for launching applications in cluster environments.
  • HWLOC (HardWare Locality) from the Open MPI project. It is used for binding threads to processors.
  • Etc.


Design model

AzequiaMPI implements each MPI endpoint as a thread and launches one container process on each shared-memory machine. The figures show the current implementation models of MPI (P is a process with a rank, T is a thread with a rank, and th is a thread without a rank).

Process-based MPI

Hybrid model

Thread-based MPI
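The thread-based model can be sketched in a few lines of C with POSIX threads. The example below is only an illustration under stated assumptions, not AzequiaMPI code: `mailbox`, `rank_send`, `rank_recv` and `run_ranks` are hypothetical names, and the one-slot rendezvous protocol was chosen for brevity. Two ranks run as threads of one container process, and because they share an address space the receiver copies the message straight out of the sender's buffer, with no intermediate shared-memory segment.

```c
#include <pthread.h>
#include <stdint.h>
#include <string.h>

/* One-slot mailbox owned by a receiving rank (hypothetical type). */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    const void     *src;       /* sender's buffer while a message is pending */
    size_t          len;
    int             full;
    unsigned long   posted;    /* messages posted so far */
    unsigned long   consumed;  /* messages copied out so far */
} mailbox;

void mailbox_init(mailbox *m) {
    pthread_mutex_init(&m->lock, NULL);
    pthread_cond_init(&m->changed, NULL);
    m->src = NULL; m->len = 0; m->full = 0;
    m->posted = 0; m->consumed = 0;
}

/* Blocks until the receiver has copied the data out of buf. */
void rank_send(mailbox *dst, const void *buf, size_t len) {
    pthread_mutex_lock(&dst->lock);
    while (dst->full)                       /* wait for a free slot */
        pthread_cond_wait(&dst->changed, &dst->lock);
    dst->src = buf; dst->len = len; dst->full = 1;
    unsigned long ticket = ++dst->posted;
    pthread_cond_broadcast(&dst->changed);
    while (dst->consumed < ticket)          /* wait until our message is consumed */
        pthread_cond_wait(&dst->changed, &dst->lock);
    pthread_mutex_unlock(&dst->lock);
}

/* Blocks until a message arrives; returns the number of bytes copied. */
size_t rank_recv(mailbox *self, void *buf, size_t cap) {
    pthread_mutex_lock(&self->lock);
    while (!self->full)
        pthread_cond_wait(&self->changed, &self->lock);
    size_t n = self->len < cap ? self->len : cap;
    memcpy(buf, self->src, n);              /* the only copy of the payload */
    self->full = 0;
    self->consumed++;
    pthread_cond_broadcast(&self->changed);
    pthread_mutex_unlock(&self->lock);
    return n;
}

#define NRANKS 2
static mailbox boxes[NRANKS];
static int received[4];

static void *rank_main(void *arg) {
    int rank = (int)(intptr_t)arg;
    if (rank == 0) {
        int data[4] = {1, 2, 3, 4};
        rank_send(&boxes[1], data, sizeof data);
    } else {
        rank_recv(&boxes[1], received, sizeof received);
    }
    return NULL;
}

/* Runs the ranks as threads of this (container) process and returns
 * the sum of the values received by rank 1. */
int run_ranks(void) {
    pthread_t th[NRANKS];
    for (int i = 0; i < NRANKS; i++) mailbox_init(&boxes[i]);
    for (intptr_t i = 0; i < NRANKS; i++)
        pthread_create(&th[i], NULL, rank_main, (void *)i);
    for (int i = 0; i < NRANKS; i++) pthread_join(th[i], NULL);
    for (int i = 0; i < NRANKS; i++) {
        pthread_mutex_destroy(&boxes[i].lock);
        pthread_cond_destroy(&boxes[i].changed);
    }
    return received[0] + received[1] + received[2] + received[3];
}
```

Calling `run_ranks()` from a program linked with the pthread library returns the sum of the four integers sent by rank 0. A process-based implementation would typically need an extra copy through a shared segment, or kernel assistance, to move the same data.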

Current work

1. Improving several aspects of the implementation:

  • Collective operations
  • Static variable issues, addressed through pre-compiling and other techniques
  • Datatypes and packing in shared-memory thread communication
  • Adding MPI-2 features supported by threads (e.g. One-Sided Communication)
  • Profiling tools

2. Adding network (TCP/IP and InfiniBand) support to the implementation.

3. AzequiaMPI is now being ported to a TCP/IP FPGA cluster on top of PowerPC hardware cores.
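The static variable issue listed above arises because a global or `static` variable, which is private to each rank in a process-based MPI, becomes shared by all ranks once ranks are threads. One generic way to restore per-rank state is C11 thread-local storage. The sketch below shows that technique only as an assumption about what "other techniques" might include; it is not the project's pre-compiler, and `calls_made`, `library_call` and `run_tls_demo` are hypothetical names.

```c
#include <pthread.h>
#include <stdint.h>

/* With a plain `static int`, all rank threads would update one shared
 * counter. `_Thread_local` gives each thread (rank) its own copy. */
static _Thread_local int calls_made;

/* Stands in for user code that keeps per-rank state in a static. */
static void library_call(void) {
    calls_made++;
}

static void *rank_body(void *arg) {
    int rank = (int)(intptr_t)arg;
    for (int i = 0; i <= rank; i++)     /* rank r makes r + 1 calls */
        library_call();
    return (void *)(intptr_t)calls_made;
}

#define MAX_DEMO_RANKS 16

/* Runs up to MAX_DEMO_RANKS threads; out[i] receives the counter value
 * observed by rank i. */
void run_tls_demo(int nranks, int *out) {
    pthread_t th[MAX_DEMO_RANKS];
    if (nranks > MAX_DEMO_RANKS)
        nranks = MAX_DEMO_RANKS;
    for (intptr_t i = 0; i < nranks; i++)
        pthread_create(&th[i], NULL, rank_body, (void *)i);
    for (int i = 0; i < nranks; i++) {
        void *ret;
        pthread_join(th[i], &ret);
        out[i] = (int)(intptr_t)ret;
    }
}
```

Each rank sees only its own count, so rank i reports i + 1 regardless of how the threads interleave. A source-to-source pre-compiler could in principle apply this kind of rewrite automatically to existing MPI programs.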


AzequiaMPI runs on different platforms:

  1. Linux-based clusters.
  2. Heterogeneous DSP/FPGA multicomputers from Sundance.
  3. Xilkernel/PetaLinux operating systems on top of the MicroBlaze soft processor in Xilinx FPGAs.
  4. Texas Instruments C6000 DSPs on top of DSP/BIOS. To implement Pthread semantics on top of DSP/BIOS semaphores we developed OSI (Operating System Interface).

You can get the latest versions from the download section.


AzequiaMPI performance has been compared to that of other implementations, both process-based and thread-based. Latency and bandwidth in shared memory take advantage of the common address space of threads. A set of figures is presented below, showing snapshots of the development.

Microbenchmarks performance

July 2011. We are studying one-copy mechanisms for communication between processes on shared-memory multicore computers, along with algorithms from several implementations, on a dual-socket, quad-core Intel Nehalem E5620 machine (shared memory) with 12 MB of cache. We explore SMARTMAP, LiMIC2, and KNEM under Open MPI and MPICH2, as well as other thread-based implementations such as MPC-MPI.

The figures show results from initial research on latency and bandwidth for point-to-point communication and for some collectives with one-to-all and all-to-one patterns. Ranks are bound to processors in round-robin fashion (rank i to core i, without SMT), and collectives are executed with 8 ranks. Note also that the color order differs between figures:

[Figures: Netpipe ping-pong latency; Netpipe ping-pong bandwidth; IMB broadcast latency; IMB broadcast bandwidth; IMB reduce bandwidth; IMB gather bandwidth]
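The round-robin binding used in these measurements (rank i to core i) can be approximated on Linux with the GNU extension `pthread_setaffinity_np`. This is a stand-in chosen to keep the example free of external libraries. AzequiaMPI itself is described above as using HWLOC for binding, and the helpers `core_for_rank` and `bind_self_to_core` are hypothetical names for this sketch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for cpu_set_t macros and pthread_setaffinity_np */
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Round-robin placement: rank i goes to core i, wrapping around when
 * there are more ranks than cores. */
int core_for_rank(int rank, int ncores) {
    assert(rank >= 0 && ncores > 0);
    return rank % ncores;
}

/* Binds the calling thread to a single core. Returns 0 on success or an
 * errno-style error number, e.g. when the core is not available to the
 * process. */
int bind_self_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof set, &set);
}
```

A rank thread would call `bind_self_to_core(core_for_rank(rank, ncores))` once at start-up. Whether core numbers i and i + 1 are distinct physical cores or SMT siblings depends on how the operating system enumerates them, which is one reason a topology library such as HWLOC is preferable in practice.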

We are working on the design and implementation of new algorithms that take advantage of threads in shared memory for collective communication.

Euroben Matrix Product Performance

July 2011. The next figures show performance measurements for the Euroben benchmark, which computes a matrix product using different communication mechanisms provided by MPI. The benchmark gives some weight to communication time, so an improvement is expected for MPI implementations with higher communication performance.

All tests are run with 8 ranks on a two-socket, quad-core Intel Nehalem (E5620) machine with 12 MB of L3 cache. All ranks are bound to processors (rank i to processor i, without SMT).

The next figures show a matrix product in which data is communicated between tasks using standard communication (MPI_Send/MPI_Recv). Performance relative to AzequiaMPI is shown as well. The X-axis shows the side size of the square matrix. The table below gives the number of bytes sent for each matrix side size.

Side size    Bytes sent
16           256 B
32           1 KB
64           4 KB
128          16 KB
256          64 KB
512          256 KB
1024         1 MB
2048         4 MB
4096         16 MB
[Figures: Euroben 1D matrix product (standard); Euroben 1D matrix product (standard), relative to AzequiaMPI]
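Every row of the table of bytes sent is consistent with a message size equal to the square of the side size, expressed in binary units (16 × 16 = 256 B, 1024 × 1024 = 1 MB). The source does not state the element type or how the matrix is partitioned, so the helper below simply reproduces that arithmetic; `message_bytes` and `format_bytes` are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>

/* Message size implied by the table: side squared, in bytes. */
size_t message_bytes(size_t side) {
    return side * side;
}

/* Formats a byte count with the binary units used in the table
 * (1 KB = 1024 B, 1 MB = 1024 KB). Fractions are truncated. */
void format_bytes(size_t bytes, char *buf, size_t cap) {
    const size_t kb = 1024u;
    const size_t mb = 1024u * 1024u;
    if (bytes >= mb)
        snprintf(buf, cap, "%zu MB", bytes / mb);
    else if (bytes >= kb)
        snprintf(buf, cap, "%zu KB", bytes / kb);
    else
        snprintf(buf, cap, "%zu B", bytes);
}
```

For example, `format_bytes(message_bytes(512), buf, sizeof buf)` fills `buf` with `256 KB`, matching the corresponding row.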

The next figures show a matrix product in which data is communicated between tasks with the MPI broadcast collective operation. Performance relative to AzequiaMPI is shown as well. All parameters are the same as in the figures above.

[Figures: Euroben 1D matrix product (broadcast); Euroben 1D matrix product (broadcast), relative to AzequiaMPI]

HPL and NAS benchmarks performance

July 2011. The next figures show realistic performance measurements obtained with High Performance Linpack (HPL) and the NAS Parallel Benchmarks (NPB). The low communication weight in these benchmarks results in virtually no difference between implementations.

Both benchmarks are run on a dual-socket, quad-core Intel Nehalem machine (E5620) with 12 MB of L3 cache. 8 tasks are run, with tasks bound to processors (rank i to processor i, without SMT).

Although we chose the NPB SP multizone benchmark, flat MPI is used, with 8 ranks and no OpenMP threads.

HPL is run with the following parameters: N=20000, NB=16, PxQ=2x4 and 1-ring broadcast.

[Figures: NPB SP Multizone class D; HPL 2x4 1-ring]


Additional information

We are interested in High Performance Computing software research and development. The International Exascale Software Project group has defined a roadmap for coordinating the efforts of the international open source software community to create a software environment that can exploit exascale systems in the coming years. We are interested in aspects of this roadmap and in developing them within our projects.

1. We are studying the influence of software on power consumption. Our goal is to find and apply software techniques for saving energy in supercomputers. Initially, we are evaluating blocking and busy waiting in synchronization and communication, using the BLK and LFQ versions of the AzequiaMPI kernel respectively. We are collaborating with CenitS/Computaex in this field.


2. We are interested in applications using MPI. Our goal is to improve AzequiaMPI performance and scalability so that it supports real-world applications efficiently on different platforms.

3. Some useful HPC links:

a) PRACE. Partnership for Advanced Computing in Europe
b) HPC-Europa2. Pan-European Research Infrastructure for High Performance Computing
c) IESP. International Exascale Software Project
d) PlanetHPC. Research and Investigation Roadmap for High Performance Computing in Europe
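The two waiting strategies being compared for power consumption, blocking and busy waiting, differ in what a thread does while it waits for an event. The sketch below contrasts them using a hypothetical `event` type. It is a generic illustration of the two strategies and does not reproduce how the BLK or LFQ kernels are actually written.

```c
#include <pthread.h>
#include <stdatomic.h>

/* A one-shot event that can be awaited either way (hypothetical type). */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    atomic_int      ready;
} event;

void event_init(event *e) {
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    atomic_init(&e->ready, 0);
}

void event_destroy(event *e) {
    pthread_mutex_destroy(&e->lock);
    pthread_cond_destroy(&e->cond);
}

void event_signal(event *e) {
    pthread_mutex_lock(&e->lock);
    atomic_store(&e->ready, 1);
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Blocking wait: the thread sleeps in the kernel and frees its core. */
void event_wait_blocking(event *e) {
    pthread_mutex_lock(&e->lock);
    while (!atomic_load(&e->ready))
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

/* Busy wait: the thread keeps its core and polls the flag.
 * Returns the number of polls that found the flag unset. */
unsigned long event_wait_spinning(event *e) {
    unsigned long spins = 0;
    while (!atomic_load_explicit(&e->ready, memory_order_acquire))
        spins++;
    return spins;
}

static event ev;
static int payload;

static void *signaler(void *arg) {
    (void)arg;
    payload = 42;          /* published by the signal that follows */
    event_signal(&ev);
    return NULL;
}

/* Waits for a helper thread with the chosen strategy and returns the
 * payload it published. */
int run_event_demo(int spin) {
    pthread_t th;
    payload = 0;
    event_init(&ev);
    pthread_create(&th, NULL, signaler, NULL);
    if (spin)
        event_wait_spinning(&ev);
    else
        event_wait_blocking(&ev);
    int seen = payload;
    pthread_join(th, NULL);
    event_destroy(&ev);
    return seen;
}
```

Busy waiting usually reacts faster because no context switch is involved, but the core stays fully active while it polls. Blocking lets the operating system idle the core at the cost of wake-up latency. Measuring that trade-off is the kind of evaluation the item above describes.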




We are writing a User and Installation Guide for AzequiaMPI. Coming soon.

Related papers



Please read the license information before downloading AzequiaMPI.


You can download the latest version under development from the svn repository: http://gim.unex.es/svn/azqmpi/trunk. It can also be browsed on the web at http://gim.unex.es/websvn/azqmpi. If you want to get involved and contribute changes, please send an email to get an account.

If you have any comments or proposals about AzequiaMPI or related software, please contact us through the forum or by email.

A wiki has been created (mostly in Spanish) for internal documentation about development and about the platforms and software in use.


The latest stable release for Linux multicore (not yet clusters) can be downloaded from here:

2014, June 10th 2.3.0


Downloads: 6

Quick Ref. Guide

Downloads: 14

2014, January 29th 2.2.4 AzequiaMPI
Downloads: 11
Readme.txt Changelog.txt


We have developed a version on FPGAs upon PetaLinux/Microblaze:

2010, Dec 10th 2.1.2_fpga Azequia
Downloads: 33
Readme.txt Changelog.txt
2010, Dec 10th 1.4.0_fpga AzequiaMPI
Downloads: 28
Readme.txt Changelog.txt


We have ported AzequiaMPI to DSP multicomputers. A version for the SMT310Q multicomputer from Sundance, with Texas Instruments C64x DSPs (SMT361, SMT361A, etc.), is available on the Azequia page.


Neither the University of Extremadura, nor any of their employees, makes any
warranty express or implied, or assumes any legal liability or responsibility
for the accuracy, completeness, or usefulness of any information, apparatus,
product, or process disclosed, or represents that its use would not infringe
privately owned rights.


Juan Carlos Díaz Martín

juancarl (At) unex (dot) es
PhD in Computer Science
Dpto. de Tecnología de los Computadores y las Comunicaciones


Juan Antonio Rico Gallego


jarico (At) unex (dot) es
Computer Science
Assistant professor
Dpt. Engineering of Informatic and Telematic Systems
Phone: 0034 927 257251


Jesús María Álvarez Llorente


llorente (At) unex (dot) es
Computer Science
M.Sc. (Advanced Studies Diploma)
Área de Lenguajes y Sistemas Informáticos





