Download CUDA Fortran for Scientists and Engineers: Best Practices for Efficient CUDA Fortran Programming by Gregory Ruetsch, Massimiliano Fatica PDF

By Gregory Ruetsch, Massimiliano Fatica

CUDA Fortran for Scientists and Engineers shows how high-performance application developers can leverage the power of GPUs using Fortran, the familiar language of scientific computing and supercomputer performance benchmarking. The authors presume no prior parallel computing experience, and cover the basics along with best practices for efficient GPU computing using CUDA Fortran.

To help you add CUDA Fortran to existing Fortran codes, the book explains how to understand the target GPU architecture, identify computationally intensive parts of the code, and modify the code to manage the data and parallelism and optimize performance. All of this is done in Fortran, without having to rewrite in another language. Each concept is illustrated with actual examples so you can immediately evaluate the performance of your code in comparison.
• Leverage the power of GPU computing with PGI's CUDA Fortran compiler
• Gain insights from members of the CUDA Fortran language development team
• Includes multi-GPU programming in CUDA Fortran, covering both peer-to-peer and message passing interface (MPI) approaches
• Includes full source code for all the examples and several case studies
• Download source code and slides from the book's companion website



Similar programming books

Beginning Perl (3rd Edition)

This is a book for those of us who believed that we didn’t need to learn Perl, and now we know it is more ubiquitous than ever. Perl is extremely flexible and powerful, and it isn’t afraid of Web 2.0 or the cloud. Originally touted as the duct tape of the Internet, Perl has since evolved into a multipurpose, multiplatform language present absolutely everywhere: heavy-duty web applications, the cloud, systems administration, natural language processing, and financial engineering.

Extra resources for CUDA Fortran for Scientists and Engineers: Best Practices for Efficient CUDA Fortran Programming

Sample text

• Profiling Asynchronous Events
• Device Memory
• Declaring Data in Device Code
• Coalesced Access to Global Memory
• Misaligned Access
• Strided Access
• Texture Memory
• Local Memory
• Detecting Local Memory Use (Advanced Topic)
• Constant Memory
• On-Chip Memory
• L1 Cache
• Registers
• Shared Memory
• Shared Memory Bank Conflicts
• Memory Optimization Example: Matrix Transpose

…464 ] cputime=[ 3127.991 ] memtransferhostmemtype=[ 1 ]
method=[ memcpyDtoH ] gputime=[ 2501.312 ] cputime=[ 2555.000 ] memtransferhostmemtype=[ 1 ]

where a value of 0 for memtransferhostmemtype indicates pageable memory and a value of 1 indicates pinned memory. Pinned memory should not be overused, since excessive use can reduce overall system performance. How much is too much is difficult to tell in advance, so, as with all optimizations, test the applications and the systems they run on for optimal performance parameters.
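As a small illustration of reading such profiler output, the sketch below parses one log line of the form quoted above and reports whether the transfer used pageable or pinned host memory. It is not taken from the book: the function names `parse_line` and `describe` are hypothetical, and the only assumption about the log format is the `key=[ value ]` layout visible in the excerpt.

```python
import re

# Matches "key=[ value ]" pairs as they appear in the profiler log excerpt.
FIELD = re.compile(r"(\w+)=\[\s*([^\]]*?)\s*\]")

# Meaning of memtransferhostmemtype as stated in the text:
# 0 indicates pageable memory, 1 indicates pinned memory.
HOST_MEM_TYPE = {0: "pageable", 1: "pinned"}


def parse_line(line):
    """Return a dict of the fields found in one profiler log line (hypothetical helper)."""
    record = {}
    for key, value in FIELD.findall(line):
        if key in ("gputime", "cputime"):
            record[key] = float(value)
        elif key == "memtransferhostmemtype":
            record[key] = int(value)
        else:
            record[key] = value
    return record


def describe(record):
    """Summarize a parsed record, labeling the host memory type (hypothetical helper)."""
    kind = HOST_MEM_TYPE.get(record.get("memtransferhostmemtype"), "unknown")
    return f"{record.get('method', '?')}: gputime={record.get('gputime')} ({kind} host memory)"


# Sample line copied from the excerpt above.
sample = ("method=[ memcpyDtoH ] gputime=[ 2501.312 ] "
          "cputime=[ 2555.000 ] memtransferhostmemtype=[ 1 ]")
rec = parse_line(sample)
print(describe(rec))
```

Comparing the `gputime` values of the same transfer under both memory types is one way to follow the advice above and measure, rather than guess, whether pinning pays off on a given system.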

