IABSE Task Group 3.1 has the mandate to define reference results for the validation of methodologies and programs used to study both the stability and the buffeting response of long-span bridges. These tools for simulating aeroelastic behaviour are fundamental to the safe design of bridges, and they should be validated. The working group decided to set up a benchmark procedure consisting of several steps to define reference results for this validation. For each step, contributors use their own methodology to simulate the bridge behaviour from the same input data. All the results are then compared, and reference values are defined through statistical analysis. The benchmark procedure is organized as a three-step problem with substeps of increasing difficulty: Step 1 compares numerical results only, Step 2 is validation against wind tunnel experiments, and Step 3 is validation against full-scale data. In this paper, the contributions and the reference results of the simplest initial substep (1.1a) are presented. It consists of the simulation of the aeroelastic response of a two-degree-of-freedom bridge deck section, with analytical aerodynamic coefficients, forced by turbulent wind. Despite the problem’s simplicity, differences in some contributions are significant, confirming the necessity of having solid references to validate software programs.