An efficient parallel all-electron 4-component Dirac-Kohn-Sham method using a distributed matrix approach