Continuous progress in semiconductor technology allows for increasingly complex processor architectures. The downside of these advances is that computing has hit a power wall, and clock frequencies can barely be increased any further. To scale computing performance in the future, both the energy efficiency of systems and their degree of parallelism have to improve significantly. The design of heterogeneous hardware with different specialized resources seems to be a promising solution. When the highest performance (high throughput, short latencies) and energy efficiency are both required, we consider the generation of dedicated FPGA accelerators as a remedy. In this work, we present the PARO high-level synthesis framework for the automated generation of massively parallel FPGA accelerators. The framework is tailored to compute-intensive applications from the domains of image, video, and other digital signal processing, as well as to algorithms from linear algebra. Unique features of PARO include: (1) design entry in the form of a compact and intuitive domain-specific language that is closely related to a mathematical problem description, (2) support for integer, fixed-point, floating-point, and custom arithmetic, (3) advanced loop transformations (e.g., partitioning) and scheduling techniques in the polyhedron model, and (4) generation of accelerator IP cores (VHDL code) that can be easily integrated into a system design such as an SoC or into a networked scenario. Finally, we showcase the capabilities of our framework by developing a range image conditioning pipeline for smart cameras used in range sensing.