Lawyers Committee for Human Rights
Yardsticks for Workers Rights:
Learning from Experience


Elements of Reliability
Accuracy
Replicability
Verifiability
Value as indicator  

Do recorded measurements for a particular workplace give a true picture of what is actually going on there? Are they a trustworthy guide to whether labor standards are actually being met, and workers' rights actually being observed? This is the core issue for workplace measurements.

Measurement reliability is more complicated than simple truth or falsehood. Not only can measurements be false, but even completely accurate measurements can be incomplete, misleading or off the point. A particular measurement, or set of measurements, needs to be assessed for reliability on several different grounds.  

There is no magic formula for testing reliability. Below is an outline that lays out the major elements involved. The purpose of the outline is not to be definitive, but to encourage a systematic approach to reliability questions. In focusing on one kind of potential unreliability, such as the risk that monitors will be lied to or that workers will not tell what they know, it is easy to overlook others that are less intuitively obvious but just as significant.

Field testing of measurements would be extremely useful - taking a particular set of measurements under normal monitoring circumstances, and then comparing the recorded data to "actual" conditions in the same workplace using much more intensive onsite observation over time. However, no field testing results are available (at least to the public) for the measurements used by any current practitioner. Any field testing being undertaken at this stage is very likely to be kept confidential. If and when field testing begins to take place in public, it will need to address each of the elements of reliability discussed below. 

Accuracy  

Recorded measurements can be factually wrong, for any of several reasons. 

i. Deception. Observers can be lied to, shown false records, or otherwise intentionally misled by managers or by workers who have been coached on what to say in interviews. They can also be misled by workers who are afraid to tell them the truth. [1]  

ii. Bias. Observers can be predisposed to shade facts in one direction or another, either for institutional reasons (e.g., who pays their salaries or controls their hiring; other functions they are also required to perform, such as mediation of disputes at the factory) or reasons that are individual to the particular monitor (e.g., personal bias).

iii. Misperception. Unbiased observers who are not being intentionally misled can still fail to observe and register the facts correctly. Inexperience, lack of training, lack of preparation, insufficient time, and insufficient access to the particular workplace are among the possible reasons.

iv. Imprecision. Measurement units can be couched in terms vague enough that a clear result does not convey clear factual content (e.g., "Is there freedom of association in this workplace?" "Yes."). In addition, even with precise units of measurement (e.g., "width of aisles"), actual measurement can be imprecise.

Replicability

A measurement result that is fully accurate at the time it was taken can still be unreliable as a measurement of actual conditions, because the result will not be the same when taken at another time, or by another observer. The reasons for potential unreliability are different in the case of quantitative measurements and qualitative measurements.

i. Quantitative results. Although apparently objective and concrete, some facts documented by an observer can quickly change, so that what the observer sees at one time would not necessarily be the same at other times (e.g., unlocked exit doors; clean toilets; workers leaving at the end of normal working hours). In other words, a facility may "dress up" for the monitor's visit, and then "dress down" again.

ii. Qualitative results. The statements of workers or others selected for interviews or polling, even if in response to standardized questions (e.g., "Can you refuse overtime work without penalty?"), may or may not be a representative sample of what others would say, or of workplace opinions generally. Random selection and adequate sample size are familiar tools of qualitative research, but they can be hard to apply in factory monitoring, where one or a few whistleblowers [2] are more likely to be the source of information about serious problems than the results of a statistically significant poll.
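
To illustrate why sample size matters for interview results, here is a minimal sketch. It is not drawn from any practitioner's method. It assumes a simple random sample of workers and the usual normal approximation for a proportion, which is itself rough at small sample sizes. The function name, the factory size of 500, and the interview counts are all hypothetical.

```python
import math
import random


def margin_of_error(p_hat, n, population=None, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    p_hat: share of interviewees giving a particular answer (0 to 1)
    n: number of workers interviewed
    population: total workers in the workplace (optional, hypothetical input)
    z: 1.96 corresponds to roughly 95% confidence under a normal approximation
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= p_hat <= 1:
        raise ValueError("p_hat must be between 0 and 1")
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    if population is not None:
        if n > population:
            raise ValueError("sample cannot exceed population")
        if population > 1:
            # Finite population correction: interviewing a large share of a
            # small workforce narrows the uncertainty.
            se *= math.sqrt((population - n) / (population - 1))
    return z * se


# Hypothetical factory: 500 workers, 30 drawn at random for interviews.
rng = random.Random(0)  # fixed seed only so the draw can be repeated
interviewees = rng.sample(range(500), 30)

# Suppose 18 of the 30 (60%) say they can refuse overtime without penalty.
moe = margin_of_error(18 / 30, 30, population=500)
print(f"60% +/- {moe * 100:.0f} percentage points")
```

Under these assumptions the 60% figure carries an uncertainty of roughly 17 percentage points either way, so it cannot distinguish a workplace where most workers feel free to refuse overtime from one where fewer than half do. None of this addresses the whistleblower problem noted above: a random poll can easily miss the one or two workers who know about a serious violation.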

Verifiability

An accurate measurement, replicable by other observers, is still limited in its reliability by the nature of the monitoring process, which depends on workplace outsiders and relatively short on-site observation time with long intervals in between. Problems related to subtle intimidation, for example, may be particularly difficult to pick up in short monitoring visits. When a measurement result can be tested in ways that are external to the process that gathered it, it is more likely to be reliable. Even if the verification test has its own vulnerabilities, it adds reliability as long as its vulnerabilities are different from those of the initial monitoring process. 

i. Cross-checking with external facts. Facts from sources other than the workplace monitoring process, used to cross-check particular measurement results from the monitoring process, can be either objective (e.g., employee turnover rates; union signup rates) or subjective (e.g., the opinions of local union organizers or social workers outside the factory).

ii. Transparency. The more that results can be seen by those in a position to spot errors, the higher the chance that errors will be caught and corrected. Knowing that potential critics will see the results also creates an incentive for scrupulous measurement in the first place. Transparency can be either selective (e.g., monitoring results shown to workers themselves, or to a neutral overseer) or general (i.e., results shown to the public, preferably on a website to ensure easy access).

Value as indicator of compliance or violation

Finally, an accurate, replicable, and verifiable piece of data may still not be a reliable indicator of compliance with a code standard, because it is not sufficiently relevant or sensitive to the core issue being measured. For example, the design of a factory's grievance process can be measured for adequacy, and the number and severity of grievances received by it can be checked. But the fact that no serious grievances had been received might, or might not, indicate that no serious violations of code standards were occurring in that factory. Some measurement units can be assembled into indicators that correlate much more closely with compliance or violation of code standards. For example, if all workers have official birth certificates available for inspection, and certificates cannot be easily forged or borrowed, it is unlikely that child labor will be a problem. But indicator value needs to be considered case by case. Three main factors contribute.

i. Relevance to core issue. The unit of measurement needs to be closely enough related to the standard so that a particular measurement result, by itself or in combination with other measurement results, correlates well with either compliance or violation of the standard. For example, the fact that all local laws are being complied with may offer little assurance that a code standard for freedom of association is being met. Often, measurement units in combination can provide a degree of reliability that each unit by itself could not (e.g., time records from a punch clock, together with verifiable reports from workers that no off-the-clock work is performed).

ii. Comparability. Units of measurement that can be used to compare one workplace's performance with another's, or to track changes in conditions at the same workplace from one time interval to the next (e.g., the average wage increase for entry-level workers after one year on the job; or the percentage of female managers compared to the percentage of female workers), are more valuable than measurements that are unique to a particular time or place (e.g., the approval rating for a special-purpose workers' committee).

iii. Gradations. A measurement unit with a range of possible results (e.g., percentages or numbers; subjective rankings expressed on a scale of 1 to 10) has higher value than a measurement with only a yes/no outcome. [3] Yes/no measurements are by far the most common in current measurement practice. [4]
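
As an illustration of a comparable, graded unit, the sketch below turns the comparability example above (percentage of female managers compared to percentage of female workers) into a single ratio that can be set side by side across workplaces. The function name, the two factories, and all headcounts are hypothetical and are not taken from any monitoring report.

```python
def representation_ratio(female_managers, total_managers,
                         female_workers, total_workers):
    """Share of managers who are women divided by share of workers who are women.

    A value near 1.0 means women appear in management in about the same
    proportion as in the workforce; lower values mean under-representation.
    Returns None when there are no female workers, since the ratio is undefined.
    """
    if total_managers <= 0 or total_workers <= 0:
        raise ValueError("headcounts must be positive")
    share_of_managers = female_managers / total_managers
    share_of_workers = female_workers / total_workers
    if share_of_workers == 0:
        return None
    return share_of_managers / share_of_workers


# Hypothetical headcounts for two factories.
factory_a = representation_ratio(4, 20, 640, 800)   # 20% of managers, 80% of workers
factory_b = representation_ratio(9, 30, 300, 500)   # 30% of managers, 60% of workers

print(f"Factory A: {factory_a:.2f}")
print(f"Factory B: {factory_b:.2f}")
```

A yes/no question such as "Are there female managers?" would give the same answer for both hypothetical factories. The ratio separates them and can be recomputed at a later visit to track change, which is the point of the comparability and gradation criteria.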



Endnotes

[1] See Monitoring for more extensive discussion.

[2] The importance of whistleblowers, and measurement issues raised by them, is discussed in Monitoring under Current weaknesses.

[3] The database allows quantitative measurements to be searched separately for those expressed in binary (yes/no) form, numerical form, scale form, etc.

[4] Of more than 2,000 quantitative measurements in the database, approximately 80% are in binary (yes/no) form.

