* Derivation of male and female title-by-status units (assumes hempst and wempst are single digit).
compute hbst=hocc*10 + hempst.
compute wbst=wocc*10 + wempst.

* Derivation of major group where it is known that the five major groups are
* defined as the first digit of the four digit occupational code for occupations
* coded 1001 to 4999, and all occupations coded 5000 or above are the 5th major group.
compute h1gp=trunc(hocc/1000).
compute w1gp=trunc(wocc/1000).
recode h1gp w1gp (5 thru hi=5).

* Derivation of minor group unit where it is known that the minor group
* is the first three digits of the four digit occupational title (as with the ISCO codes) .
compute h3gp=trunc(hocc/10).
compute w3gp=trunc(wocc/10).

* Derivation of major and minor group-by-status units.
compute h1gpst=h1gp*10 + hempst.
compute w1gpst=w1gp*10 + wempst.
compute h3gpst=h3gp*10 + hempst.
compute w3gpst=w3gp*10 + wempst.
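The arithmetic behind these derived units can be checked outside SPSS. The following is a minimal Python sketch, not part of the original syntax: the function names are hypothetical, and it assumes non-negative integer codes, so that integer division matches the SPSS trunc() calls above.

```python
def with_status(unit: int, empst: int) -> int:
    """Append a single-digit employment status to a unit code (unit*10 + status)."""
    if not 0 <= empst <= 9:
        raise ValueError("employment status must be a single digit")
    return unit * 10 + empst


def major_group(occ: int) -> int:
    """First digit of the four-digit title; titles coded 5000 or above fall in group 5."""
    return min(occ // 1000, 5)


def minor_group(occ: int) -> int:
    """First three digits of the four-digit title."""
    return occ // 10


# Hypothetical husband record: title 1222, status 2 (self-employed with employees).
hocc, hempst = 1222, 2
hbst = with_status(hocc, hempst)                  # title-by-status unit
h1gpst = with_status(major_group(hocc), hempst)   # major group-by-status unit
h3gpst = with_status(minor_group(hocc), hempst)   # minor group-by-status unit
print(hbst, h1gpst, h3gpst)
```

The single-digit check makes the stated assumption explicit: a two-digit status would overflow into the title digits and produce ambiguous unit codes.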
2.2 Translating from a 'case' to a 'table' file: SPSS syntax
get file="casedata.sav".
compute hbst=hocc*10 + hempst.
compute wbst=wocc*10 + wempst.

* run some tables for info :.
tables /ftotal=ftot1 "Total"
  /tables(labels) + ftot1 by (hocc + wocc)
  /statistics count ((F5.0) ' Cases ')
  /title "Occupational titles".
tables /ftotal=ftot1 "Total"
  /tables(labels) + ftot1 by (hempst + wempst)
  /statistics count ((F5.0) ' Cases ')
  /title "Occupational employment status".
tables /ftotal=ftot1 "Total"
  /tables(labels) + ftot1 by (hbst + wbst)
  /statistics count ((F5.0) ' Cases ')
  /title "Occupational title by status".
* these tables will show the maximum range of hbst and wbst, for instance 1 - 100000.
* However, that range is often too big for SPSS to be able to export a crosstabulation on the 'raw' data values.
* A solution is to first autorecode the base unit values.

autorecode var=hbst wbst /into=thbst2 twbst2.
tables /ftotal=ftot1 "Total"
  /tables(labels) + ftot1 by (thbst2 + twbst2)
  /statistics count ((F5.0) ' Cases ')
  /title "Occupational title by status".
* the max range of the autorecoded variables will probably be a lot lower, e.g. 1 to 1000.
save outfile="mtch1.sav".

* the upper limits (1000) in the crosstabs below should cover the maximum autorecoded values shown in the table above.
procedure output outfile="temp1.dat".
crosstabs variables = thbst2 (1,1000) twbst2 (1,1000)
  /tables=thbst2 by twbst2
  /write=all.
* this writes out the table file to a plain data file; the next command reads it back into spss.
data list file="temp1.dat" /freq 12-20 thbst2 21-28 twbst2 29-35.
* the numbers indicate the column locations of each field in the written data file, which are set by default in SPSS to those given.
sort cases by thbst2 twbst2.
select if (freq gt 0).

* save this then match it with the original case file.
save outfile="mtch2.sav" /keep=thbst2 twbst2 freq.
get file="mtch1.sav".
sort cases by thbst2 twbst2.
match files file=* /table="mtch2.sav" /by=thbst2 twbst2.
compute first=1.
if ( (lag(thbst2) = thbst2) & (lag(twbst2)=twbst2) ) first=0.
select if (first=1).
* this selection retains only one case per combination, so that each row is one cell of the table file.
tables /ftotal=ftot1 "Total"
  /tables(labels) + ftot1 by (hbst + wbst)
  /statistics count ((F5.0) ' Cases ')
  /title "Occupational title by status".
* all the different values are represented but the frequencies are not correct : the total is just the total number of combinations represented.
weight by freq.
tables /ftotal=ftot1 "Total"
  /tables(labels) + ftot1 by (hbst + wbst)
  /statistics count ((F5.0) ' Cases ')
  /title "Occupational title by status".
weight off.
* after weighting, the results are the same as previously from the case file.

compute pidid=$casenum.
* this creates an identifier variable which can be of use at later stages of the process.
save outfile="tabledata.sav"
  /keep=pidid freq hbst wbst hocc wocc hempst wempst [{h/w}{1/2/3/..}gp{st}] .
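The logic of the translation, independent of the SPSS mechanics, is to keep one row per observed husband-wife combination together with its frequency. A minimal Python sketch of that idea follows; it is an illustration rather than the method used above, the function name and the example records are hypothetical, and it counts the raw unit values directly instead of going through autorecode and crosstabs.

```python
from collections import Counter


def case_to_table(pairs):
    """Collapse (hbst, wbst) case records into one row per combination with its frequency."""
    counts = Counter(pairs)
    rows = []
    # Sorting by (hbst, wbst) mirrors 'sort cases by thbst2 twbst2',
    # since autorecoded values keep the order of the original values.
    for pidid, ((hbst, wbst), freq) in enumerate(sorted(counts.items()), start=1):
        rows.append({"pidid": pidid, "hbst": hbst, "wbst": wbst, "freq": freq})
    return rows


# Hypothetical case file of five couples.
cases = [(11101, 12222), (11101, 12222), (12222, 11101),
         (11101, 12222), (12222, 12222)]
table = case_to_table(cases)
for row in table:
    print(row)

# Unweighted, the table has one row per combination;
# weighted by freq, it reproduces the number of cases.
print(len(table), sum(row["freq"] for row in table))
```

This is the same check as the two tables commands above: the unweighted total counts combinations, while the total weighted by freq should equal the number of cases in the original case file.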
2.3 Occupational unit 'value labels': SPSS syntax
2.3.3 CONTENTS OF THE INCLUDE FILE "versionlabels.sps" :
* Title only units :.
define occlab (occ=!enclose('{','}')) .
add value labels !occ
  1110 "1110 Legislators"
  1120 "1120 Senior government officials"
  1130 "1130 Traditional chiefs and heads of villages"
  1140 "1140 Senior officials of special interest organizations"
  1200 "1200 Corporate managers"
  1210 "1210 Directors and chief executives"
  1220 "1220 Production and operations department managers"
  1221 "1221 Production and operations department managers in agriculture, hunting, forestry and fishing"
  1222 "1222 Production and operations department managers in manufacturing"
  1223 "1223 Production and operations department managers in construction"
  etc "etc" .
!enddefine.

* Status only units :.
define stlab (occ=!enclose('{','}')) .
add value labels !occ
  1 "1 Self-Employed no employees"
  2 "2 Self-Employed with employees"
  etc "etc"
  9 "9 Unknown".
!enddefine.

* Title-by-status units.
define bstlab (occ=!enclose('{','}')) .
add value labels !occ
  11101 "11101 Semp0 Legislators"
  11401 "11401 Semp0 Senior officials of special interest organizations"
  etc "etc"
  11102 "11102 Semp1+ Legislators"
  11402 "11402 Semp1+ Senior officials of special interest organizations"
  etc "etc"
  etc "etc"
  etc "etc"
  11109 "11109 UnkwnSt Legislators"
  11409 "11409 UnkwnSt Senior officials of special interest organizations"
  etc "etc" .
!enddefine.

* Occupational major groups :.
define majlab (occ=!enclose('{','}')).
add value labels !occ
  1 "1 Legislators, senior officials and managers"
  2 "2 Professionals"
  etc "etc".
!enddefine.

* Other relevant groups in same style to be added further...
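The title-by-status labels follow a regular pattern: the unit code, a short status abbreviation, then the title label. A minimal Python sketch of that pattern is given below. It is only an illustration of how such a label list could be generated, not how the include file was produced; the function and variable names are hypothetical, and only the titles and status abbreviations that appear in the listing above are used.

```python
# Title labels and status abbreviations taken from the include file listing.
title_labels = {
    1110: "Legislators",
    1140: "Senior officials of special interest organizations",
}
status_abbrev = {1: "Semp0", 2: "Semp1+", 9: "UnkwnSt"}


def bst_labels(titles, statuses):
    """Build {code: label} for every title-by-status unit, with code = title*10 + status."""
    labels = {}
    for status, abbrev in statuses.items():
        for occ, text in titles.items():
            code = occ * 10 + status
            labels[code] = f"{code} {abbrev} {text}"
    return labels


labels = bst_labels(title_labels, status_abbrev)
for code in sorted(labels):
    # One 'add value labels' entry per line, in the style of the bstlab macro.
    print(f'{code} "{labels[code]}"')
```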
2.3.4 Call the relevant macros: SPSS syntax
get file="tabledata.sav".
include file="versionlabels.sps".
occlab occ={hocc wocc}.
stlab occ={hempst wempst}.
bstlab occ={hbst wbst}.
majlab occ={h1gp w1gp}.
* etc etc for the macros labelling any other units.

* check the results:.
weight by freq.
tables /ftotal=ftot1 "Total"
  /tables(labels) + ftot1 by (hocc + wocc)
  /statistics count ((F5.0) ' Cases ')
  /title "Occupational titles".
* etc etc for other units.
weight off.
save outfile="tabledata.sav".
2.4 'Square autorecoded' values: SPSS syntax
get file="tabledata.sav".

* male units only :.
save outfile="mtch1.sav"
  /keep=pidid hocc hempst hbst h{1/2/3/..}gp{st} .

* female units only:.
compute pid2=pidid+1000000.
* (assumes no more than (arbitrary) 1000000 h-w combinations, would need bigger number otherwise).
save outfile="mtch2.sav"
  /keep=pid2 wocc wempst wbst w{1/2/3/..}gp{st}
  /rename (pid2=pidid)
          ( wocc wempst wbst w{1/2/3/..}gp{st} = hocc hempst hbst h{1/2/3/..}gp{st} ) .

* add female units to file of male units :.
add files file="mtch1.sav" /in=one /file="mtch2.sav" /in=two /by=pidid.

* autorecode the two together :.
autorecode var=hocc hempst hbst h{1/2/3/..}gp{st}
  /into= hocc2 hempst2 hbst2 h{1/2/3/..}gp{st}2 .
save outfile="temp1.sav".
sort cases by pidid.
select if (one=1).
save outfile="mtch3.sav"
  /keep=pidid hocc2 hempst2 hbst2 h{1/2/3/..}gp{st}2 .

get file="temp1.sav".
sort cases by pidid.
compute pid3=pidid-1000000.
select if (two=1).
save outfile="mtch4.sav"
  /keep=pid3 hocc2 hempst2 hbst2 h{1/2/3/..}gp{st}2
  /rename (pid3=pidid)
          ( hocc2 hempst2 hbst2 h{1/2/3/..}gp{st}2 = wocc2 wempst2 wbst2 w{1/2/3/..}gp{st}2 ) .

* return to original file and match on recoded data :.
get file="tabledata.sav".
match files file=* /file="mtch3.sav" /file="mtch4.sav" /by=pidid.

* Assess the autorecoded variables :.
weight by freq.
tables /format blank missing ('.')
  /ftotal=ftot1 "Total"
  /tables (labels) + ftot1 by (hocc2 + wocc2)
  /statistics count ((F5.0) ' Cases ')
  /title="Husband and wife title".
tables /format blank missing ('.')
  /ftotal=ftot1 "Total"
  /tables (labels) + ftot1 by (hempst2 + wempst2)
  /statistics count ((F5.0) ' Cases ')
  /title="Husband and wife employment status".
tables /format blank missing ('.')
  /ftotal=ftot1 "Total"
  /tables (labels) + ftot1 by (hbst2 + wbst2)
  /statistics count ((F5.0) ' Cases ')
  /title="Husband and wife title-by-status".
tables /format blank missing ('.')
  /ftotal=ftot1 "Total"
  /tables (labels) + ftot1 by (h{1/2/3/..}gp{st}2 + w{1/2/3/..}gp{st}2)
  /statistics count ((F5.0) ' Cases ')
  /title="Husband and wife {major / submajor / minor /.. }group{-by-status}" .
weight off.
* (These tables will reveal the new number of autorecoded units
* represented in each file).
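The point of stacking the male and female units before autorecoding is that both receive codes from one shared scheme, so a given unit gets the same consecutive code whether it belongs to a husband or a wife. A minimal Python sketch of that idea follows; it illustrates the principle only, the function name and example values are hypothetical, and it ranks the pooled values directly rather than stacking and re-matching files as the SPSS syntax does.

```python
def square_autorecode(h_values, w_values):
    """Recode husband and wife units with one shared set of consecutive codes."""
    # Pool the distinct values from both sides, as 'add files' followed by 'autorecode' does.
    pooled = sorted(set(h_values) | set(w_values))
    code = {value: rank for rank, value in enumerate(pooled, start=1)}
    h_recoded = [code[v] for v in h_values]
    w_recoded = [code[v] for v in w_values]
    return h_recoded, w_recoded, code


# Hypothetical table file with three husband-wife combinations.
hbst = [11101, 12222, 12222]
wbst = [12222, 11409, 11101]
hbst2, wbst2, code = square_autorecode(hbst, wbst)
print(code)
print(hbst2, wbst2)
```

Because the codes come from the pooled values, a unit held only by wives (11409 here) still takes its place in the husbands' scheme, which is what makes the resulting husband-by-wife table square.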
Last modified 14 February 2002.
This document is maintained by Paul Lambert (paul.lambert@stirling.ac.uk)