M.COM. P-ASSOCIATION OF ATTRIBUTES
##################################
Association of Attributes
It is well known that statistical methods deal only with quantitative facts. Quantitative data may arise in two different ways :
(1) The characteristics which are capable of being measured quantitatively are termed as statistics of variables (numerical classification); for instance, height, weight, wages, length, income, expenditure, etc.
(2) The characteristics which are not capable of quantitative measurement are termed as statistics of attributes (descriptive classification). There are certain phenomena like blindness, dumbness, deafness, literacy, etc., which cannot be measured directly. In such cases only their presence or absence can be studied. The quantitative character in such cases arises solely by counting. For example, we can say that an item is defective or not defective, or that a candidate who appears for an examination may pass or fail.
Thus, statistics of variables is based on numerical character and statistics of attributes is based on descriptive character. So far we have dealt with the classification, summarisation, interpretation and correlation of the statistics of variables. In this chapter, we will discuss the method of determining whether there exists any relation between different attributes. The purpose of the study of association of attributes is to find out whether a given attribute shows association with another. For example, if there is any association between, say, sickness and medication, those medicated should not suffer as much from sickness as those not medicated. This can be checked through a comparative study of the state of health of a set of people, dividing them into those who are administered the medicine and those who are not, and observing the effect on the two classes of persons. In this way the association between sickness and medication can be established.
Classification of Data in case of Statistics of Attributes
In the analysis of statistics relating to attributes, the first thing is the classification of data. Here data are classified on the basis of presence or absence of particular attributes. In the simplest case, if only one attribute is being studied, then only two mutually exclusive classes are formed : one of those observations in which the attribute is present and the other of those in which the attribute is not present. For instance, when we study the literacy of a village, the population of the village is divided into two classes : one class of people who are literate and the other class of people who are not literate. Such a classification is termed as “classification by dichotomy”. In actual analysis, usually there are more than two classes into which the universe is divided, and such classification is called manifold classification. For example, take the attribute literacy along with criminality. Then, the classes will be literate, not literate, criminal, not criminal, literate criminal, literate not criminal, not literate criminal and not literate not criminal.
Here it may be noted that it is absolutely essential to lay down a clear-cut definition of each attribute under study.
Notation and Terminology : For the sake of convenience in analysis, it is necessary to use certain symbols to represent different classes and their frequencies. Usually capital letters, A, B and C etc., are used to denote the presence of attributes and the Greek letters, α, β and γ etc., are used to denote the absence of these attributes respectively. ‘N’ denotes the universe or the number of items.
Combination of Attributes : The combination of attributes is denoted by grouping together the concerned letters, e.g., AB is the combination of the attributes A and B. Thus, for the attributes A (illiteracy) and B (smoking), AB would mean both the attributes of illiteracy and smoking. Similarly, Aβ will represent illiteracy and non-smoking; αB, literacy and smoking; and αβ, literacy and non-smoking.
If a third attribute C is included to represent, say, male sex, then ABC will stand for illiterate males who are smokers. Similarly, ABγ, AβC, Aβγ, etc., can be understood.
Class Frequency : The number of units in a class is called its “class frequency”. Class frequencies are denoted by enclosing the class symbols in brackets. Thus (AB) would represent the frequency of the class AB.
Classes : Different attributes in themselves are called classes. If the attribute is present, it is termed as positive class; and its contrary or opposite is known as negative class. The positive class, in which the attribute is present, is denoted by capital letters, A, B, C etc. The negative class, in which the attribute is absent, is denoted by small Greek letters α (Alpha), β (Beta), γ (Gamma). In place of Greek letters, small letters a, b, c etc., can be used. For example, if A represents the attribute of literacy, then α represents the absence of literacy. If B represents criminality, then β represents absence of criminality. If C represents blindness, then γ represents absence of blindness. N will denote the universe.
Order of Classes : The order of classes depends upon the number of attributes under study. A class represented by n attributes is called a class of nth order and the corresponding frequency as the frequency of the nth order. The class in which only one attribute is considered is called the class of the first order. The class in which two attributes are considered is called the class of the second order. The class in which three attributes are considered is called the class of the third order; and so on.
Thus, A, B, C, α, β, γ are classes of 1st order. AB, Aβ, αB, αβ, AC, Aγ, αC, αγ, BC, Bγ, βC, βγ are classes of 2nd order. ABC, ABγ, AβC, Aβγ, αBC, αBγ, αβC, αβγ are classes of 3rd order.
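The listing of classes by order can be generated mechanically. The following is a minimal sketch, not part of the original notes: the function name `classes_of_order` and the pairing of each attribute with its negative symbol are assumptions made for illustration.

```python
from itertools import combinations, product

# Each attribute is a (positive symbol, negative symbol) pair,
# following the notation A/α, B/β, C/γ used in the text.
ATTRIBUTES = [("A", "α"), ("B", "β"), ("C", "γ")]


def classes_of_order(attributes, order):
    """Return every class that specifies exactly `order` of the attributes."""
    result = []
    # Choose which attributes are specified, then choose presence or
    # absence for each chosen attribute.
    for chosen in combinations(attributes, order):
        for symbols in product(*chosen):
            # With no attribute specified, the class is the universe N.
            result.append("".join(symbols) or "N")
    return result


for order in range(len(ATTRIBUTES) + 1):
    print(order, classes_of_order(ATTRIBUTES, order))
```

For three attributes this yields 1 class of order 0, 6 of the first order, 12 of the second order and 8 of the third order.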
N which represents the total number of items is taken as a class of ‘0’ order as no attribute is specified for it.
Ultimate Classes : The classes of highest order are called the ultimate classes and their frequencies are called the ultimate class frequencies. For example, when a population is classified on the basis of two attributes, the 2nd order will be the highest order, and when it is classified on the basis of three attributes, the third order will be the highest order.
The number of ultimate classes is represented by $2^n$, where n stands for the number of attributes under study. Thus, for one attribute, there will be $2^1 = 2$ classes; for two attributes, the ultimate classes will be $2^2 = 4$; for three attributes, the ultimate classes will be $2^3 = 8$; and so on.
Total Number of Classes : The total number of classes depends upon the number of attributes under study. The total number of classes is always equal to $3^n$, where n stands for the number of attributes. If only one attribute is studied, then there will be $3^1$ or only three classes (N, A, α). If the number of attributes is two, represented by A and B, the total number of classes (including N) would be 9. They would be N, A, B, α, β, AB, Aβ, αB and αβ. Similarly, when there are three attributes the total number of classes would be $3^3$ or 27. If the number of attributes is 4, the total number of classes would be $3^4$ or 81.
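The following table gives the number of class frequencies of each order and the total number of class frequencies for up to 3 attributes :
One attribute (A) : order 0, 1 class; order 1, 2 classes; total, 3 classes.
Two attributes (A, B) : order 0, 1 class; order 1, 4 classes; order 2, 4 classes; total, 9 classes.
Three attributes (A, B, C) : order 0, 1 class; order 1, 6 classes; order 2, 12 classes; order 3, 8 classes; total, 27 classes.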
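The number of classes of each order can be checked with a short calculation. This is a sketch under one assumption that is not stated in the notes but follows from them: a class of order k is formed by choosing k of the n attributes and then choosing presence or absence for each, giving C(n, k) × 2^k classes. The function name `class_counts` is illustrative.

```python
from math import comb


def class_counts(n):
    """Number of classes of each order 0..n for n dichotomous attributes."""
    # comb(n, k) ways to pick the attributes, 2**k presence/absence patterns.
    return [comb(n, k) * 2**k for k in range(n + 1)]


for n in (1, 2, 3):
    counts = class_counts(n)
    # The last entry is the number of ultimate classes (2**n) and the
    # sum over all orders is the total number of classes (3**n).
    print(n, counts, sum(counts))
```

The totals agree with the rules given above: the highest order always has 2^n classes and all orders together have 3^n classes.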
Determination of Unknown Class Frequencies
From a set of given class frequencies, to find the remaining class frequencies one has to first write the inter-relation between them and then substitute the known values.
In classifying statistical data according to attributes the following simple rule should be kept in mind. Any class frequency can always be expressed in terms of class frequencies of higher order. Thus, the frequencies of the first order can be expressed in terms of the frequencies of the second order, which in turn can be expressed in terms of frequencies of the third order, and so on. On the basis of this rule we can set up various types of relationships between the frequencies of different orders. If there is one attribute only, represented by A, the frequency of the universe or N can be divided into two classes (A) and (α). Thus, N = (A) + (α).
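For two attributes, the same rule gives relations such as (A) = (AB) + (Aβ) and (α) = (αB) + (αβ). The sketch below applies them to derive the unknown class frequencies from N, (A), (B) and (AB). The function name and the figures are hypothetical, chosen only for illustration.

```python
def complete_two_attribute_table(n, a, b, ab):
    """Derive the remaining class frequencies from N, (A), (B) and (AB)."""
    freq = {
        "N": n,
        "A": a,
        "B": b,
        "AB": ab,
        "α": n - a,    # N = (A) + (α)
        "β": n - b,    # N = (B) + (β)
        "Aβ": a - ab,  # (A) = (AB) + (Aβ)
        "αB": b - ab,  # (B) = (AB) + (αB)
    }
    freq["αβ"] = freq["α"] - freq["αB"]  # (α) = (αB) + (αβ)
    return freq


# Hypothetical data: 1000 persons, 600 with A, 500 with B, 350 with both.
table = complete_two_attribute_table(n=1000, a=600, b=500, ab=350)
for symbol, value in table.items():
    print(f"({symbol}) = {value}")
```

As a check, the four ultimate class frequencies (AB), (Aβ), (αB) and (αβ) should add up to N.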
If the data can be numerically expressed, the method of correlation is employed to find out the relationship existing between the two variables. If it is desired to investigate the relationship between data of a descriptive character, known as attributes, the method of association is resorted to.
Statistically, two attributes are associated if they occur together in a greater number of cases than expected had they been independent. If there is no association between two attributes A and B, we expect to find the same proportion of A’s amongst the B’s as amongst the Non-B’s.
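The proportion criterion stated above can be sketched as follows: the proportion of A's amongst the B's is (AB)/(B) and amongst the non-B's is (Aβ)/(β). The function name and the figures are hypothetical, and the sketch assumes that neither (B) nor (β) is zero.

```python
from fractions import Fraction


def compare_proportions(ab, a_beta, alpha_b, alpha_beta):
    """Compare the proportion of A's amongst the B's and amongst the non-B's."""
    b = ab + alpha_b            # (B) = (AB) + (αB)
    beta = a_beta + alpha_beta  # (β) = (Aβ) + (αβ)
    among_b = Fraction(ab, b)
    among_non_b = Fraction(a_beta, beta)
    if among_b > among_non_b:
        verdict = "positive association"
    elif among_b < among_non_b:
        verdict = "negative association"
    else:
        verdict = "independent"
    return among_b, among_non_b, verdict


# Hypothetical ultimate class frequencies (AB), (Aβ), (αB), (αβ).
among_b, among_non_b, verdict = compare_proportions(350, 250, 150, 250)
print(among_b, among_non_b, verdict)
```

Exact fractions are used rather than decimals so that the test for equality of the two proportions is not disturbed by rounding.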
Difference Between Association and Correlation
Though both ‘Association’ and ‘Correlation’ are used to study the relationship between two or more characteristics, they are entirely different from each other. The main differences between the two are as follows :
(1) The term correlation is applied to the study of the relationship between two or more variables where they can be quantitatively measured, while the term association refers to the study of the relationship between attributes, which cannot be measured in terms of figures.
(2) Association can be ‘Total’ or ‘Partial’, but correlation can be simple, multiple or partial.
Types of Association
There can be three kinds of association between attributes :
(1) Positive Association : When two attributes are present or absent together in the data, it is known as positive association. In such cases, the actual frequency is more than the expected frequency. Such association is found between literacy and employment, smoking and cancer, vaccination and immunity from a disease, etc.
(2) Negative Association: When the presence of an attribute is associated with the absence of the other attribute, it is called negative association. In such cases, the actual frequency is less than the expected frequency. Such association is found between vaccination and attack of a disease, cleanliness and ill-health, etc.
(3) Independence : When there exists no association between two attributes, that is, when they have no tendency to be present together and the presence of one attribute does not affect the other, the two attributes are regarded as independent. In such a situation, the actual frequency is equal to the expected frequency.
Methods of Determining Association
There are three methods of studying association :
(1) Method of Comparison of Observed and Expected Frequencies.
(2) Method of Comparison of Proportion.
(3) Coefficient of Association.
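Of the three methods listed, the coefficient of association gives a single relative measure. The notes name Yule's coefficient in the questions at the end but do not state its formula; the sketch below uses the standard form Q = [(AB)(αβ) − (Aβ)(αB)] / [(AB)(αβ) + (Aβ)(αB)]. The function name and the figures are hypothetical, and the sketch assumes the denominator is not zero.

```python
def yule_q(ab, a_beta, alpha_b, alpha_beta):
    """Yule's coefficient of association from the four ultimate frequencies."""
    together = ab * alpha_beta  # both present or both absent
    apart = a_beta * alpha_b    # one present, the other absent
    return (together - apart) / (together + apart)


# Hypothetical ultimate class frequencies (AB), (Aβ), (αB), (αβ).
print(yule_q(350, 250, 150, 250))
```

The value lies between −1 and +1: positive values indicate positive association, negative values indicate negative association, and zero indicates independence.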
(1) Method of Comparison of Observed and Expected Frequencies
Under this method, the actual number of observations is compared with the expected number. Thus, in order to find out the association between two attributes, it becomes necessary to find out the expected number of their simultaneous occurrences. According to the theory of probability, the expectation of a particular event is : $\text{Expectation} = \frac{\text{Number of favourable cases}}{\text{Total number of cases}}$
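The comparison can be sketched as follows, taking the expected frequency of A and B occurring together under independence as (A) × (B) / N, the expression that also appears in the objective questions at the end of these notes. The function names and the figures are hypothetical.

```python
from fractions import Fraction


def expected_ab(n, a, b):
    """Expected frequency of AB if A and B were independent: (A) x (B) / N."""
    return Fraction(a * b, n)


def association_type(n, a, b, ab):
    """Classify the association by comparing observed (AB) with expected."""
    expected = expected_ab(n, a, b)
    if ab > expected:
        return "positive"
    if ab < expected:
        return "negative"
    return "independent"


# Hypothetical data: N = 1000, (A) = 600, (B) = 500, observed (AB) = 350.
print(expected_ab(1000, 600, 500), association_type(1000, 600, 500, 350))
```

This method only indicates the nature of the association; it does not measure its degree.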
EXAMINATION QUESTIONS
1. What is meant by association ? Distinguish between ‘Association’ and ‘Correlation’ and explain the theory and measurement of association of attributes.
2. Define association of attributes. How would you calculate it ? Discuss the usefulness of coefficient of association in analysing economic statistics.
3. What do you understand by Association of Attributes ? What is meant by consistency of data ? How will you examine the consistency of data classified according to different attributes ? Illustrate with examples.
4. Distinguish between the concepts of ‘correlation’, ‘regression’ and ‘association of attributes’ and describe the situation in which each of these should be used. Illustrate your answer with examples.
5. Define association of attributes. How would you calculate it?
6. What do you understand by ‘Association of Attributes’? How will you examine the consistency of data classified according to different attributes ?
7. What do you mean by consistency of data ? How will you examine it ? Write the conditions of consistency in case of two attributes.
8. Write short notes on the following:
(i) Class frequencies
(ii) Ultimate class frequencies
(iii) Consistency of Data
(iv) Total Association
(v) Partial Association
(vi) Illusory Association
(vii) Coefficient of Colligation
(viii) Coefficient of Association
9. Bring out clearly the difference between ‘total’ and ‘partial’ association.
10. Write a short note on the use of coefficient of association in analysing economic statistics.
Short Answer Theoretical Questions
1. What do you understand by association of attributes ?
2. Distinguish between correlation and association of attributes.
3. What do you mean by consistency of data ? How will you examine it?
4. Differentiate between association, disassociation and independence.
5. Describe the most commonly used method for determining the association between two attributes.
6. Explain the method of comparison of proportion to find out the association of attributes.
7. Explain the ‘Partial Association’.
8. Discuss the ‘Illusory Association’.
9. Explain the ‘Nine-Square Table’.
10. Discuss the Coefficient of Association.
Very Short Answer Questions
1. Write down the formula of Yule’s Coefficient of Association.
2. Give a format of nine-square table.
3. Which method is used to find out the relative measure of association between two attributes ?
4. When are data to be deemed consistent ?
5. Which type of basis of classification is used to find out the association of attributes?
Objective Type Questions
Fill in the blanks :
1. Association of attributes can be ………… or ………… .
2. If the frequency of any class is …………, then data will be considered inconsistent.
3. ………… derived a formula to find the relative measure of association between two attributes.
4. When observed and expected frequencies are equal, then both attributes are ………… .
5. To find the association of attributes, facts are classified on the basis of ………… .
6. Association of attributes is found out among the ………… facts.
Ans. (1) Positive, Negative, (2) Negative (3) Yule, (4) Independent, (5) Attributes, (6) Qualitative
State whether the following statements are ‘true’ or ‘false’ :
1. If the frequency of any class is not negative, then data are to be deemed consistent.
(True)
2. Association of attributes is found out between quantitative facts.
(False)
3. The association of attributes is always positive.
(False)
4. When the observed frequency of attributes is less than their expected frequency, then the association of attributes will be negative.
(True)
5. When observed and expected frequencies of attributes are equal, then attributes will be independent.
(True)
6. Association of attributes can be simple, multiple and partial.
(False)
7. The Percentage Method is used to find the relative measure of association between two attributes.
(False)
Select the correct option :
1. Data are to be considered inconsistent, if any ultimate class frequency is :
(a) Positive
(b) Negative
(c) Zero
(d) None of these
2. Association of attributes can be :
(a) Positive
(b) Negative
(c) Independent
(d) Any of these
3. To measure the degree of association of attributes, formula was derived by:
(a) Fisher
(b) Yule
(c) Karl Pearson
(d) None of these
4. Main methods of determining association are :
(a) Two
(b) Four
(c) Three
(d) Five
5. In case of positive association :
(a) (AB) = (A) × (B) / N
(b) (AB) < (A) × (B) / N
(c) (AB) > (A) × (B) / N
(d) None of these
6. Which of the following methods is used to find out the relative measure of association between two attributes ?
(a) Method of Comparison of Observed and Expected Frequencies
(b) Method of Comparison of Proportion
(c) Coefficient of Association
(d) None of these
Ans. 1. (b), 2. (d), 3. (b), 4. (c), 5. (c), 6. (c).
##################################
DR. PRAVEEN KUMAR-9760480884
##################################