In fact, only of the lower house, the Sejm. Since the 8th of November 2011, the 460 deputies of the current term have voted 5545 times. The Sejm website publishes the individual results of the votes as PDF files. I put the results into an SQLite database with an unholy hodgepodge of shell scripts, wget, pdftohtml, and Python. Here is a sample query and its output:

    SELECT deputy, party, voting, result
    FROM VotingResults
    JOIN Deputies USING(deputy_id)
    JOIN Parties USING(party_id)
    JOIN Votings USING(voting_id)
    JOIN Results USING(result_id)
    ORDER BY random() LIMIT 5;

    HOK MAREK | PO | data/glos_32_10.html | NAY
    MILLER LESZEK | SLD | data/glos_49_42.html | AYE
    MASŁOWSKA GABRIELA | PiS | data/glos_18_14.html | ABSTAIN
    ROMANEK ANDRZEJ | SP | data/glos_93_58.html | AYE
    ŁOPATA JAN | PSL | data/glos_57_65.html | ABSENT

The database registers as many as 509 deputies, more than the 460 seats, since some of them died during the term or were appointed or elected to other posts. The deputies are grouped into parties according to their latest status. There are data on 5539 of the 5545 votings: I skipped 6 elections of speakers, whose PDF layout was different.

Using Python and numpy, we can put the results into a matrix **R** with 5539 rows and 509 columns. Its columns, one per deputy, are vectors in the 5539-dimensional space of voting results. The elements of **R** are +1 for AYE, −1 for NAY, and 0 for ABSTAIN and ABSENT. Shift the AYE and NAY entries of each row so that the row mean becomes zero, keeping ABSTAIN and ABSENT unbiased at zero:

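    import numpy

    # Mean of each row over its non-zero entries (AYE and NAY only).
    M = R.sum(axis=1) / numpy.fabs(R).sum(axis=1)
    MT = M[:, numpy.newaxis]  # column vector, broadcast across deputies
    # Subtract the mean from AYE and NAY entries; leave zeros untouched.
    Rcentred = numpy.where(R != 0, R - MT, 0)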
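To make the construction of **R** concrete, here is a minimal sketch that builds it from a toy in-memory database and then applies the centring step. The column names follow the sample query; the exact layout of the `VotingResults` and `Results` tables, the `ENCODING` map, and all the data are assumptions made for illustration, not the contents of the real database.

```python
import sqlite3

import numpy

# Encoding described in the text: +1 AYE, -1 NAY, 0 ABSTAIN and ABSENT.
ENCODING = {"AYE": 1.0, "NAY": -1.0, "ABSTAIN": 0.0, "ABSENT": 0.0}

# Toy in-memory database with made-up data; the table layout is an assumption.
connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE Results (result_id INTEGER PRIMARY KEY, result TEXT);
    CREATE TABLE VotingResults (voting_id INTEGER, deputy_id INTEGER, result_id INTEGER);
    INSERT INTO Results VALUES (1, 'AYE'), (2, 'NAY'), (3, 'ABSTAIN'), (4, 'ABSENT');
    INSERT INTO VotingResults VALUES
        (10, 100, 1), (10, 101, 1), (10, 102, 2), (10, 103, 4),
        (11, 100, 2), (11, 101, 1), (11, 102, 3), (11, 103, 2);
""")
rows = connection.execute(
    "SELECT voting_id, deputy_id, result "
    "FROM VotingResults JOIN Results USING(result_id)"
).fetchall()

# Map database ids to consecutive row (voting) and column (deputy) indices.
voting_index = {v: i for i, v in enumerate(sorted({row[0] for row in rows}))}
deputy_index = {d: j for j, d in enumerate(sorted({row[1] for row in rows}))}

R = numpy.zeros((len(voting_index), len(deputy_index)))
for voting_id, deputy_id, result in rows:
    R[voting_index[voting_id], deputy_index[deputy_id]] = ENCODING[result]

# The centring step from the text, applied to the toy matrix.
M = R.sum(axis=1) / numpy.fabs(R).sum(axis=1)
MT = M[:, numpy.newaxis]
Rcentred = numpy.where(R != 0, R - MT, 0)

print(R)
print(Rcentred)
print(Rcentred.sum(axis=1))  # each row now sums to (numerically) zero
```

A voting in which nobody voted AYE or NAY would make the division produce NaN and a runtime warning, although `numpy.where` would still leave that row at zero.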

Then perform the Singular Value Decomposition of **R**_{centred}:

    U, S, VT = numpy.linalg.svd(Rcentred, full_matrices=False)

**U** holds the so-called left-singular vectors of **R**_{centred}: orthogonal directions, ordered so that each successive one captures the largest possible remaining variance of **R**_{centred}. **S** contains the so-called singular values of **R**_{centred}. We do not use **V**^{T}.
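The squared singular values are proportional to the variance along the corresponding left-singular vectors, so one common way to get each vector's share of the total variance is to normalise them. The sketch below shows this on a small made-up matrix standing in for **R**_{centred}; whether the percentages in this post were computed exactly this way is an assumption.

```python
import numpy

# Made-up 5 x 4 matrix standing in for Rcentred (not real voting data).
toy = numpy.array([
    [1.0, 1.0, -1.0, 0.0],
    [-1.0, 1.0, 0.0, -1.0],
    [1.0, -1.0, 1.0, 1.0],
    [0.0, 1.0, -1.0, -1.0],
    [1.0, 1.0, 1.0, -1.0],
])

U, S, VT = numpy.linalg.svd(toy, full_matrices=False)

# Normalised squared singular values: share of total variance per vector.
shares = S**2 / numpy.sum(S**2)
print(numpy.round(100 * shares, 1))
```

`numpy.linalg.svd` returns the singular values in descending order, so the shares come out largest first.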

We have just carried out the Principal Component Analysis of **R**. Thanks to it, we can visualize the 5539 dimensions of **R** in a much smaller number of dimensions. In our case, the first three left-singular vectors account for 70.1%, 8.5%, and 3.6% of the total variance of **R**, respectively. The projection of the vectors of **R**_{centred} onto its first three left-singular vectors looks like this:

The x axis, usually called the partisan axis, corresponds to the division between the government (PO+PSL) and the opposition (everybody else). The votings with the highest absolute weights in the first left-singular vector concern the involvement of the government in the Amber Gold scandal: this (−0.0216) and this (+0.0217).

The y axis appears to correspond to right wing-left wing polarization (everybody else versus SLD+RP). Its most significant votings are the rejection of a proposal to waive the law against insulting religious feelings (−0.0451) and the proposal to raise the taxes on copper and silver mining (+0.0447).

The z axis, in turn, seems to correspond to sentiments towards the European Union (everybody else versus PiS+RP+ZP aka SP). Its most significant votings concern two alignments between Polish and EU law: this (−0.0463) and this (+0.0470).
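The weights quoted above are entries of the left-singular vectors, one per voting. A minimal sketch of how the votings with the highest absolute weights could be picked out follows; `top_votings` is a hypothetical helper and `toy_U` is a made-up matrix, not real data.

```python
import numpy


def top_votings(U, component, count=2):
    """Return (row index, weight) pairs with the largest absolute weights."""
    weights = U[:, component]
    # Sort row indices by decreasing absolute weight and keep the first few.
    order = numpy.argsort(-numpy.abs(weights))[:count]
    return [(int(i), float(weights[i])) for i in order]


# Made-up 3 x 2 matrix standing in for U (rows: votings, columns: components).
toy_U = numpy.array([[0.1, -0.5], [-0.7, 0.2], [0.3, 0.6]])
print(top_votings(toy_U, component=0))
```

The returned row indices still have to be mapped back to voting identifiers through whatever row order was used when building **R**. Note also that the sign of each singular vector is arbitrary, so only the relative signs of the weights carry meaning.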

Finally, here are the votes of individual deputies projected on the xy, xz, and yz planes.
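The coordinates behind such plots can be obtained by projecting each deputy's column of **R**_{centred} onto the first three left-singular vectors. A minimal sketch, again on a small made-up matrix in place of the real data:

```python
import numpy

# Made-up 5 votings x 4 deputies matrix (not real voting data).
R = numpy.array([
    [1.0, 1.0, -1.0, 0.0],
    [-1.0, 1.0, 0.0, -1.0],
    [1.0, -1.0, 1.0, 1.0],
    [0.0, 1.0, -1.0, -1.0],
    [1.0, 1.0, 1.0, -1.0],
])

# Centring and SVD as in the text.
M = R.sum(axis=1) / numpy.fabs(R).sum(axis=1)
Rcentred = numpy.where(R != 0, R - M[:, numpy.newaxis], 0)
U, S, VT = numpy.linalg.svd(Rcentred, full_matrices=False)

# Rows of coords are the x, y, z coordinates; columns are deputies.
coords = U[:, :3].T @ Rcentred
x, y, z = coords

# Pairs of coordinates for the three planes.
xy = numpy.column_stack([x, y])
xz = numpy.column_stack([x, z])
yz = numpy.column_stack([y, z])
print(xy)
```

The same coordinates equal `S[:3, numpy.newaxis] * VT[:3]`, so working from **U** alone gives the same picture as working from **V**^{T}.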

**EDIT**: here is a zoomable version, made with the Bokeh visualization library. Thanks for the tip, stared!

You can find the source code at https://bitbucket.org/mciura/sejm


Great. I have been meaning to do this for a few years, but it always got put off "for later". Very nice results.


Voting data is public and should be made available by Sejm in some more script-friendly format. You did a great job parsing votes out of this crap.


January Weiner did a similar analysis (in Polish) for the previous term of the Sejm.


Interesting, because it shows that PiS votes en bloc more than the other parties. For example, SLD is widely scattered along 'z': clearly part of it is pro-European and part is anti-.


Is centring the rows while preserving the zeros legitimate? Such an operation changes the absolute value of the row's dispersion.


I understand the objection. In my opinion, the result of such centring is perfectly fine. It is the input values (+1, −1, 0) that I treat as temporary, without much physical meaning.
