Journal of Applied Science and Engineering

Published by Tamkang University Press


Impact Factor



P. K. Mishra This email address is being protected from spambots. You need JavaScript enabled to view it.1

1Department of Applied Mathematics, Birla Institute of Technology, Mesra, Ranchi  835215, India


Received: March 22, 2004
Accepted: October 8, 2004
Publication Date: March 1, 2005

Download Citation: ||  


We present a technique for finding a lower bound on the number of processors needed to achieve a given schedule of an algorithm represented as a dag for a Diophantine system. The application of this technique is illustrated with a tensor product computation. We then provide a time-minimal processor schedule that meets the computed processor lower bounds, including the one for tensor product. We produce a formula for the number of lattice points in the convex polyhedron that are scheduled for a particular time step (which is a lower bound on the number of processors needed to satisfy the schedule). This is done by constructing a system of parametric linear Diophantine equations whose solutions represent the lattice points of interest. Our principal contribution to lower bounds in the algorithm and its implementation for constructing the generating function from which a formula for the number of these solutions is produced.

Keywords: Diophantine System, Systolic Algorithm, Generating Function, DAG


  1. [1] Li, G. J. and Wah, B. W., “The Design of Optimal Systolic Algorithms,” IEEE Trans. Comput, C-34, pp. 6677 (1985).
  2. [2] Ibarra, O. H. and Palis, M., “VLSI Algorithms for Solving Recurrence Equations and Applications,” IEEE Trans. on Acoust. Speech, and Signal Processing, ASSP-35, pp. 10461064 (1987).
  3. [3] Moldovan, D. I., “On the Design of Algorithms for VLSI Systolic Arrays,” Proc. IEEE, Vol. 71, pp. 113120 (1983).
  4. [4] Louka, B. and Tchuente, M., “An Optimal Solution for Gauss-Jordon Elimination on 2D Systolic Arrays,” In John, V. McCanny, John McWhirter and Earl, E. Swartzlander Jr., editors, Systolic Array Processors, Prentice-Hall, Killarney, Ireland, pp. 264274 (1989).
  5. [5] Hirschberg, D. S. “Recent Results on the Complexity of Common-subsequence Problems,” In David Sankoff and Joseph, B. Kruskal, editors, Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparison. Addison-Wesley, Reading, MA, U.S.A. (1983).
  6. [6] Karrp, R. M.; Miller, R. E. and Winograd, S., “Properties of a Model for Parallel Computations: Determinacy, Termination, Queueing,” SIAM J. Appl. Math, Vol. 14, pp. 13901411 (1966).
  7. [7] Kung, H.-T. and Leiserson, C. E., “Algorithms for VLSI Processor Arrays,” In Introduction to VLSI Systems, Addison-Wesley, Menlo Park, CA, U.S.A. pp. 271292 (1980).
  8. [8] Lamport, L., “The Parallel Execution of Do-Loops,” Comm. of the ACM, Vol. 17, pp. 8393 (1974).
  9. [9] Wilde, D. K., “A Library for Doing Polyhedral Operations,” Master’s thesis, Corvallis, Oregon, 1993. Also published as IRISA technical report PI 785, Rennes, France (1993).
  10. [10] Krishnamurthy, E. V.; Kunde, M.; Schimmler, M. and Schroder, H., “Systolic Algorithm for Tensor Products of Matrices: Implementation and Applications,” Parallel Computing, Vol. 13, pp. 301308 (1990).
  11. [11] Stanley, R. P., “Linear Homogeneous Diophantine Equations and Magic Labelings of Graphs,” Duke Math. J., Vol. 40, pp. 607632 (1973).
  12. [12] Benaini, A. and Robert, Y., “Space-time-minimal Systolic Arrays for Gaussian Elimination and the Algebraic Path Problem,” Parallel Computing, Vol. 15, pp. 211225 (1990).
  13. [13] Clauss, P.; Mongenet, C. and Perrin, G. R., “Calculus of Space-optimal Mappings of Systolic Algorithms on Processor Arrays,” In Proc. Int. Conf. on Application Specific Array Processors, IEEE Computer Society, Princeton, NJ, U.S.A. pp. 418 (1990).
  14. [14] Aho, A. V.; Hopcroft, J. E. and Ullman J. D., “The Design and Analysis of Computer Algorithms,” Addisonwesley, Reading, MA, U.S.A. (1974).
  15. [15] Wong, Y. and Delosme, J.-M., “Optimization of Processor Count for Systolic Arrays,” Dept. of Computer Sci. RR-697, Yale Univ., U.S.A. (1989).
  16. [16] Wong, Y. and Delosme, J.-M., “Space-optimal Linear Processor Allocation for Systolic Array Synthesis,” In Prasanna, V. K. and Canter, L. H., editiors, Proc. 6th Int. Parallel Processing Symposium, IEEE Computer Society Press, Baverly Hills, CA, U.S.A. pp. 275282 (1992).
  17. [17] Shang, W. and Fortes, J. A. B., “Time Optimal Linear Schedule for Algorithms with Uniform Dependencies,” IEEE Transactions on Computers, Vol. 40, pp. 723742 (1991).
  18. [18] Darte, A.; Khachiyan, L. and Robert, Y., “Linear scheduling is Close to Optimal,” In José Fortes, Edward Lee, and Teresa Meng, editors, Application Specific Array Processors, IEEE Computer Society Press, August pp. 3746 (1992).
  19. [19] Krishnamurthy, E. V. and Schroder, H., “Systolic Algorithm for Multivariable Approximation Using Tensor Products of Basis Functions,” Parallel Computing, Vol. 17, pp. 483492 (1991).
  20. [20] MacMahon, P. A., “The Diophantine Inequality x 'y,” In George E. Andrews, editor, Collected Papers, Vol. I, Combinatorics, The MIT Press, Cambridge, MA, U.S.A. pp. 12121232(1979).
  21. [21] Garey, M. R. and Johnson, D. S., “Computers and Intractability: A Guide to the Theory of NP-Completeness,” W. H. Freeman, San Francisco, CA, U.S.A. (1979).
  22. [22] Quinton, P., “Automatic Synthesis of Systolic Arrays from Uniform Recurrent Equations,” In Proc. 11th Ann. Symp. on Computer Architecture, pp. 208214 (1984).
  23. [23] Rajopadhye, S. V.; Purushothaman, S. and Fujimoto, R. M., “On Synthesizing Systolic Arrays from Recurrence Equations with Linear Dependencies,” In K. V. Nori, editor, Lecture Notes in Computer Science, No. 241: Foundations of Software Technology and Theoretical Computer Science, Springer Verlag, New Delhi, India pp. 488503 (1986).
  24. [24] Sahni, S., “Computational Related Problems,” SIAM J. Comput., Vol. 3, pp. 262279 (1974).
  25. [25] Clauss, P. and Loechner, V., “Parametric Analysis of Polyhedral Iteration Spaces,” Journal of VLSI Signal Processing, Vol. 19, pp. 179194 (1998).
  26. [26] Granata, J.; Conner, M. and Tolimieri, R., “Recursive Fast Algorithm and the Role of the Tensor Product,” IEEE Transactions on Signal Processing, Vol. 40, pp. 29212930 (1992).
  27. [27] MacMahon, P. A., “Note on the Diophantine Inequality x 'y,” In George E. Andrews, editor, Collected Papers, Vol. I, Cambinatorics, MIT Press, Cambridge, MA, U.S.A. pp. 12331246 (1979).