General FAQ

General questions about 3L Diamond

Are Diamond communications polled?

In a word, no. Diamond never uses polling for I/O transfers. Whenever a thread has to wait for a channel transfer, it is descheduled and other threads continue to execute. The waiting thread will resume when the completion of the transfer is signalled by an interrupt.

The mistaken idea that Diamond polls probably derives from an incorrect statement in a 2006 article by M. Raulet et al., "Rapid Prototyping for Heterogeneous Multicomponent Systems: An MPEG-4 Stream over a UMTS Communication Link" (EURASIP Journal on Applied Signal Processing, Volume 2006, Article ID 64369, pp. 1-13, DOI 10.1155/ASP/2006/64369).

This article stated that '[in 3L Diamond,] data transfers are realized using DMA, but without any computation parallelism which is nearly equivalent to polling technique.' This is simply a misunderstanding of how Diamond works. A thread that is waiting for a channel transfer is indeed blocked and performs no more computation until the transfer has completed, but other threads continue to run, providing the 'computation parallelism' that is claimed to be absent. There is no polling (or its equivalent) in Diamond communications.
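The difference between blocking and polling can be sketched in ordinary C. The example below is only an illustration using POSIX threads, not the Diamond API: channel_t, channel_receive, channel_complete and run_demo are hypothetical names invented for this sketch, and a condition-variable signal stands in for the completion interrupt. The receiving thread sleeps inside pthread_cond_wait while another thread carries on computing.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a channel; this is NOT the Diamond API. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  done;      /* signalled when the "transfer" completes */
    bool            complete;
    int             value;
} channel_t;

typedef struct {
    channel_t ch;
    int       received;
} demo_t;

static void channel_init(channel_t *ch) {
    pthread_mutex_init(&ch->lock, NULL);
    pthread_cond_init(&ch->done, NULL);
    ch->complete = false;
    ch->value = 0;
}

/* Blocking receive: the caller sleeps inside pthread_cond_wait and uses
 * no CPU time. The while loop only re-checks the flag after a wakeup
 * (condition variables may wake spuriously); it is not a busy loop. */
static int channel_receive(channel_t *ch) {
    pthread_mutex_lock(&ch->lock);
    while (!ch->complete)
        pthread_cond_wait(&ch->done, &ch->lock);
    int v = ch->value;
    pthread_mutex_unlock(&ch->lock);
    return v;
}

/* Plays the role of the completion interrupt in this sketch. */
static void channel_complete(channel_t *ch, int value) {
    pthread_mutex_lock(&ch->lock);
    ch->value = value;
    ch->complete = true;
    pthread_cond_signal(&ch->done);
    pthread_mutex_unlock(&ch->lock);
}

static void *waiter(void *arg) {
    demo_t *d = arg;
    d->received = channel_receive(&d->ch);
    return NULL;
}

/* Starts a receiving thread, does unrelated work on the calling thread,
 * then completes the transfer. Returns the work result, or -1 if the
 * thread could not be created. If the waiter has not started waiting by
 * the time the transfer completes, the flag makes it return at once. */
static long run_demo(int *received) {
    demo_t d;
    pthread_t t;
    long work = 0;

    channel_init(&d.ch);
    d.received = 0;
    if (pthread_create(&t, NULL, waiter, &d) != 0)
        return -1;

    for (long i = 1; i <= 1000; i++)   /* other computation continues */
        work += i;

    channel_complete(&d.ch, 42);
    pthread_join(t, NULL);
    *received = d.received;

    pthread_cond_destroy(&d.ch.done);
    pthread_mutex_destroy(&d.ch.lock);
    return work;
}
```

Calling run_demo(&received) from a program linked with the platform's thread library leaves the transferred value in received and returns the result of the computation that ran in the meantime.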

 

Why do I need 3L Diamond?

This is rather like asking, why do I need a high-level language? The strict answer is that you can make do without tools to help you build applications, but you will be making your task an order of magnitude harder.  If you decide not to use 3L Diamond to build a multiprocessor application you will have to do all the work yourself, including things like:

  • loading all the processors, including ones remote from the host PC
  • keeping control of all the separate modules needed to load your application
  • starting your application in a synchronised way
  • managing communications, possibly including deadlock-free message routing
  • writing your own device drivers
  • explicitly managing all memory allocation
  • writing your own multithreading support or using something that is likely to be less efficient than Diamond's
  • inventing a host communication mechanism
  • being prepared to rewrite your source if the configuration changes
  • being prepared to make major changes if the underlying hardware changes
  • supporting multiple source versions for all hardware and configuration variations
  • handling all of the underlying hardware peculiarities (unexpected cache behaviour, for example)
  • ... and many more.

Diamond can do all of this for you. 3L works closely with hardware vendors and puts a great deal of time, effort, and experience into optimising all aspects of the system to give you the best results.

See the more complete article here.

 

 

What makes 3L Diamond different?

There are many features that make Diamond significantly different from other tools that developers try to use for multiprocessor applications. Some of the more significant features are:
  • Diamond was designed from the beginning for highly-optimised multiprocessor applications and has been used successfully on platforms with anywhere from one to over one thousand processors. It is not a single-processor system that claims to support multiprocessors, nor one that has been stretched to do things that far exceed its original constraints.

  • Diamond has a coherent model that is used to describe multiprocessor systems and transparently supports heterogeneous systems made from differing processor types. It is not an ad hoc collection of bits and pieces that you have to fit together as best you can.

  • The Diamond model works equally well on DSPs and FPGAs. Even users with little or no knowledge of FPGAs can use the impressive parallel performance they offer to accelerate parts of their applications.

  • The Diamond multiprocessor compiler is unique. It constructs your application after having been presented with all of the required components and so can gather information from them to detect many opportunities for in-processor and system-wide optimisations.

  • Diamond loads the whole of the processor network for you; you do not have to struggle loading each processor individually.

  • Diamond's flexibility is unsurpassed. You can easily change your hardware, the number and type of the processors, and the topology of your system without needing to change your code. Building variant applications with tasks moved from one processor to another, even from DSPs to FPGAs, is truly trivial.

  • Diamond takes over many of the boring and error-prone housekeeping operations that are of little real interest to you but essential to get a multiprocessor application working. That leaves you free to concentrate on your core skills and algorithms.

 

Do I need other software? Which versions in particular?

Diamond allows you to build tasks for DSPs and FPGAs.  The source code for these tasks is translated by tools which you need to have installed.  These tools include compilers for DSPs and GPPs, and bitstream generators for FPGAs. Commonly these will be the TI compiler, linker and assembler which you get as part of Code Composer Studio, and the Xilinx ISE tools.  These tools must be obtained separately.

Diamond supports these software tool versions:

Diamond Release   Supported CCS version   Supported Xilinx ISE Foundation version
3.1.10            3.3                     9.2
3.2.2             3.3                     11.4
4.0               4.0                     11.4 onward

There are some optional tools that you can get from third parties to assist building Diamond tasks, for example Impulse Co-Developer; these can increase productivity but are not essential.

 

Will it run faster if I do it myself?

You can nearly always make any program run faster if you have the time, knowledge, and experience, but you have to ask yourself:

What is the "it" I am going to do myself?

What is it that Diamond does that you are going to do yourself because you think you can get a faster result?

  • Is it loading all the processors? Do you know how to arrange that large sections of initialisation code do not waste valuable memory? Diamond does.

  • Is it device drivers? Diamond's device drivers have been highly optimised over many years of working directly with the manufacturers and incorporate many non-obvious techniques that have been proven to improve throughput.

  • Is it the kernel? The Diamond kernel has been written by experts with many years' experience in maximising performance and optimising code. It is under constant scrutiny to find any places where improvement is possible. That said, in the vast majority of real applications, most of the CPU time is spent in user algorithms and not in any code provided by Diamond.

Why do I think I can make it run faster than using Diamond?

Performance comes from using the best algorithms. Often the actual implementation of those algorithms has little effect on performance. For example, no matter how tightly you code a bubble sort, even in assembler, it isn't going to out-perform a simple implementation of quicksort on a reasonably large set of data. If you have the best algorithm, where do you think your "faster" is going to come from? What part of Diamond do you know is inefficient? Are you certain?
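The bubble sort point can be checked by counting comparisons rather than timing anything. The sketch below is an illustration only and is unrelated to Diamond itself: it runs a plain bubble sort and the standard C library qsort over the same pseudo-random data and reports how many comparisons each made. The helper names (count_comparisons, fill and so on) are invented for this example, and note that the C standard does not require qsort to be a quicksort, so it merely stands in here for "a simple implementation of a better algorithm".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static unsigned long comparisons;   /* incremented by every comparison */

/* Three-way integer comparison that also counts how often it is called. */
static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    comparisons++;
    return (x > y) - (x < y);
}

/* Plain bubble sort with the usual early exit when a pass swaps nothing. */
static void bubble_sort(int *v, size_t n) {
    for (size_t pass = 0; pass + 1 < n; pass++) {
        int swapped = 0;
        for (size_t i = 0; i + 1 < n - pass; i++) {
            if (cmp_int(&v[i], &v[i + 1]) > 0) {
                int tmp = v[i];
                v[i] = v[i + 1];
                v[i + 1] = tmp;
                swapped = 1;
            }
        }
        if (!swapped)
            break;
    }
}

/* Deterministic pseudo-random fill so both sorts see identical input. */
static void fill(int *v, size_t n) {
    unsigned int state = 1u;
    for (size_t i = 0; i < n; i++) {
        state = state * 1103515245u + 12345u;
        v[i] = (int)((state >> 16) & 0x7fffu);
    }
}

static int is_sorted(const int *v, size_t n) {
    for (size_t i = 1; i < n; i++)
        if (v[i - 1] > v[i])
            return 0;
    return 1;
}

/* Sorts n pseudo-random ints and returns the number of comparisons made.
 * use_bubble != 0 selects bubble_sort; otherwise the library qsort. */
static unsigned long count_comparisons(size_t n, int use_bubble) {
    int *v = malloc(n * sizeof *v);
    unsigned long result;

    assert(v != NULL);
    fill(v, n);
    comparisons = 0;
    if (use_bubble)
        bubble_sort(v, n);
    else
        qsort(v, n, sizeof *v, cmp_int);
    assert(is_sorted(v, n));
    result = comparisons;
    free(v);
    return result;
}
```

Comparing count_comparisons(1000, 1) with count_comparisons(1000, 0) shows the gap: the bubble sort's count grows roughly with the square of the input size, which no amount of hand-tuning of the inner loop can recover.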

  • Will it be worth it? Is any improvement I might get going to be worth the considerable time and effort I will have to spend to do everything myself when Diamond can do much of the work automatically?

  • Am I prepared to lose flexibility? "Doing it yourself" often translates to hoping for some speed improvement at the likely cost of making your code more obscure and building in assumptions about the processors, topology, and general requirements of the application. If you write another application on the same hardware or move the current one to different hardware, you'll probably have to repeat significant parts of the development.

 


Did You Know?

You can ask for memory from the heap to be aligned on a particular boundary using memalign.
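As a rough illustration of that tip, the sketch below requests a buffer on a chosen boundary and checks the returned address. It assumes the conventional signature void *memalign(size_t alignment, size_t size) declared in <malloc.h>, as provided by glibc; the header and any alignment constraints that apply under a Diamond toolchain are not stated here and should be checked in its documentation. The helper names alloc_aligned_buffer and is_aligned are invented for this example.

```c
#include <malloc.h>   /* declares memalign on glibc; header is an assumption elsewhere */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns nonzero if p lies on an `alignment`-byte boundary. */
static int is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p % alignment) == 0;
}

/* Allocates `size` bytes from the heap on an `alignment`-byte boundary.
 * `alignment` is expected to be a power of two. Returns NULL on failure.
 * With glibc the block is released with free(). */
static void *alloc_aligned_buffer(size_t alignment, size_t size) {
    return memalign(alignment, size);
}
```

A typical use is a buffer that must start on a cache-line or DMA boundary: request it with alloc_aligned_buffer(128, 1024), confirm with is_aligned(buf, 128), and free it when done.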