We've recently found a method for simulating the power of a GPU using only the CPU. The algorithm is called the 'DFTW method' and was originally proposed by Peter B. d'Eaault and Yaroslav V. Teplov. Their paper, "GPU Accelerated Numerical Computation of the Diffusion Kernel via Conjugate Gradients," is available online for free at [1]. (The download links are for versions of the CUDA SDK prior to 2.5; the current version is 3.1. The links changed between the first posting and this one: the first post included a version 2.5 link that was only for the GPU implementation.)

I am working on a system that needs to make many large calculations (billions of calls to the convolution and correlation functions), so a GPU / CUDA implementation is very appealing. I cannot afford multiple cards, so I was looking for a way to use just one card in the system and somehow implement the GPU algorithm on the CPU in CUDA.

Here is the outline of our simulation algorithm:

1.) We create a big (100,000 element) random vector and multiply it by a random (100,000 x 100,000) matrix to get the response. Because of the speed gained with this 'single-GPU' method, I would like to run about 4-5 million or more calculations before terminating the program (re-initializing the random state each time).

2.) Initialize the random state:
   1a) We create an array of random numbers that is the same size as our (100,000 x 100,000) matrix A.
   1b) We initialize this random state using a single (n x n x 512) DFTW3D for the X direction, and we repeat the calculation for the Y and Z directions. This operation consumes about 4-5 million calculations.

3.) The convolution for the entire matrix is calculated as follows:
   2a) We multiply our (n x n x 512) DFTW3D kernel by the random initial state we created in part 1. This operation consumes about 5 million calculations.
   2b) We initialize the first set of (n x n x 512) DFTW3D operations to A. This takes another 5 million calculations (all of which complete in less than a second).
   2c) We multiply the (n x n x 512) DFTW3D kernel by the first set of (n x n x 512) DFTW3D operations from 2b to create the first DFTW3D step of our convolved response.
   2d) We multiply the (n x n x 512) DFTW3D kernel by the convolved matrix from the previous DFTW3D step. This is all that is necessary for the entire process.

4.) We then repeat this process 5 more times to obtain the Nth convolution step.

We plan to use part of this system to model a GPU on a machine that has only a CPU and no graphics processing unit. We would also like to make use of the fact that GPUs perform single precision calculations much more efficiently than double precision ones. The DFTW method as implemented on the GPU uses 16 or 32 bit floating point. Because the work is all done on the CPU, I will need to figure out how to interface the CPU and GPU using OpenMP. I assume there is some API to execute code on the GPU, but I am not sure of any other details. At the moment I have not explored GPU programming in CUDA. I do have the original paper and as many other papers as I can find about the DFTW method, but my code needs to be functional in its current state.

Thank you very much for any help.

E.J. Chau

> From: Mark Hoemmen
> Date: Mon, 20 Jan 2003 14:11:59 -0500
> To: "EJChau@ACD.com"
> Subject: Re: [VOTE] GPU Accelerated Numerical Computation of
> the Diffusion Kernel via Conjugate Gradients
>
> On Mon, Jan 20, 2003 at 02:10:46PM -0500, EJ Chau wrote:
> > Can someone tell me what GPU implementation of the DFTW method
> > would be the fastest? Would it be in part because the DFTW
> > method is more amenable to CPU computing rather than GPU computing?
> >
> > Thanks.
> >
> > -------------------------------------------------------------------------
> > | This message was sent by Mark S. Hoemmen from                          |
> > | personal email to the wxPython developers mailing list                 |
> > | Using wxPython is appreciated! Please                                  |
> > | don't use the list unless you really have something to say.            |
> > -------------------------------------------------------------------------
> >
> > _______________________________________________
> > wxPython-users mailing list
> > wxPython-users@...
> > https://lists.sourceforge.net/lists/listinfo/wxpython-users

---
You are currently subscribed to wxPython-developers as: "J.Beghtol@ACD.com"
To unsubscribe send a blank email to wxPython-developers-request@...
To subscribe send a blank email to wxPython-developers-request@...
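For anyone who wants something concrete to experiment with, here is a minimal sketch of the repeated convolution in steps 3 and 4 of the outline, written in plain Python on the CPU. It is not the DFTW method from the paper: it assumes the kernel products are circular convolutions and swaps in an ordinary naive discrete Fourier transform (the convolution theorem) in place of DFTW3D. The function names (dft, circular_convolve_direct, circular_convolve_dft, repeated_convolution) and the 8-element sizes are made up for illustration only.

```python
import cmath
import random


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = 0j
        for t, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # The inverse transform carries the 1/n normalization.
        out.append(total / n if inverse else total)
    return out


def circular_convolve_direct(signal, kernel):
    """Circular convolution computed straight from the definition."""
    n = len(signal)
    return [
        sum(signal[j] * kernel[(i - j) % n] for j in range(n))
        for i in range(n)
    ]


def circular_convolve_dft(signal, kernel):
    """Same result via the convolution theorem: multiply spectra pointwise."""
    spectrum = [a * b for a, b in zip(dft(signal), dft(kernel))]
    return [c.real for c in dft(spectrum, inverse=True)]


def repeated_convolution(state, kernel, steps):
    """Apply the kernel `steps` times (assumed reading of steps 3 and 4)."""
    for _ in range(steps):
        state = circular_convolve_dft(state, kernel)
    return state


rng = random.Random(0)  # fixed seed so the run is reproducible
state = [rng.random() for _ in range(8)]   # tiny stand-in for the random state
kernel = [rng.random() for _ in range(8)]  # tiny stand-in for the kernel

direct = circular_convolve_direct(state, kernel)
via_dft = circular_convolve_dft(state, kernel)
max_error = max(abs(a - b) for a, b in zip(direct, via_dft))
result = repeated_convolution(state, kernel, steps=5)
print("max difference between direct and DFT convolution:", max_error)
```

The direct and DFT-based results should agree to within rounding error, which gives a simple correctness check to run against any faster CPU or GPU version later.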