Unit 1: Introduction to Research

Research: Definition, Meaning, Importance, Types and Qualities of Research

Research is defined as a careful consideration of study regarding a particular concern or problem using scientific methods. According to the American sociologist Earl Robert Babbie, "Research is a systematic inquiry to describe, explain, predict and control the observed phenomenon. Research involves inductive and deductive methods."

Meaning: Research comprises "creative and systematic work undertaken to increase the stock of knowledge, including knowledge of humans, culture and society, and the use of this stock of knowledge to devise new applications." It is used to establish or confirm facts, reaffirm the results of previous work, solve new or existing problems, support theorems, or develop new theories. A research project may also be an expansion of past work in the field. Research projects can be used to develop further knowledge on a topic, or, in the example of a school research project, to further a student's research prowess and prepare them for future jobs or reports. To test the validity of instruments, procedures, or experiments, research may replicate elements of prior projects or the project as a whole.

The primary purposes of basic research (as opposed to applied research) are documentation, discovery, interpretation, or the research and development (R&D) of methods and systems for the advancement of human knowledge. Approaches to research depend on epistemologies, which vary considerably both within and between the humanities and sciences. There are several forms of research: scientific, humanities, artistic, economic, social, business, marketing, practitioner research, life, technological, etc.

Types of Research

1. Basic Research
Basic research is mostly conducted to enhance knowledge. It covers the fundamental aspects of research. The main motivation of this research is knowledge expansion. It is non-commercial research and does not facilitate creating or inventing anything. An experiment conducted purely to expand knowledge is a good example of basic research.

2. Applied Research
Applied research focuses on analyzing and solving real-life problems. This type of research refers to study that helps solve practical problems using scientific methods. It plays an important role in solving issues that impact the overall well-being of humans. For example, finding a specific cure for a disease.

3. Problem-Oriented Research
As the name suggests, problem-oriented research is conducted to understand the exact nature of a problem in order to find relevant solutions. The term "problem" refers to having an issue or being of two minds while making a decision. For example: the revenue of a car company has decreased by 12% in the last year. The following could be the probable causes: production is not optimal, product quality is poor, there is no advertising, economic conditions are unfavourable, etc.

4. Problem-Solving Research
This type of research is conducted by companies to understand and resolve their own problems. Problem-solving research uses applied research to find solutions to existing problems.

5. Qualitative Research
Qualitative research is a process of inquiry that helps build an in-depth understanding of problems or issues in their natural settings. It is a non-statistical research method. Qualitative research depends heavily on the experience of the researchers and the questions used to probe the sample. The sample size is usually restricted to 6-10 people.
Open-ended questions are asked in a manner that one question leads to another. The purpose of asking open-ended questions is to gather as much information as possible from the sample. The following methods are used for qualitative research:
• One-to-one interviews
• Focus groups
• Ethnographic research
• Content/text analysis
• Case study research

6. Quantitative Research
Quantitative research is a structured way of collecting data and analyzing it to draw conclusions. Unlike qualitative research, this research method uses computational, statistical and similar methods to collect and analyze data. Quantitative data is all about numbers. Quantitative research involves a larger population, as more people mean more data; in this manner, more data can be analyzed to obtain accurate results. This type of research method uses closed-ended questions because, in quantitative research, researchers are typically looking to measure the extent of something and gather statistically sound data. Online surveys, questionnaires, and polls are the preferred data collection tools in quantitative research. There are various methods of deploying surveys or questionnaires, and in recent times online surveys and questionnaires have gained popularity. Survey respondents can receive these surveys on mobile phones, by email, or can simply use the internet to access them.

Qualities of Research
• Empirical: based on observations and experimentation on theories.
• Systematic: follows an orderly and sequential procedure.
• Controlled: all variables except those being tested/experimented upon are kept constant.
• Employs hypothesis: guides the investigation process.
• Analytical: all data used are critically analyzed so that there is no error in their interpretation.
• Objective, unbiased and logical: all findings are logically based on empirical evidence.
• Employs quantitative or statistical methods: data are transformed into numerical measures and are treated statistically.

Research Application in Functional Areas of Business

Part of a business' growth is the deployment of separate departments which function with a specific focus and definite path. They are structured according to certain business requirements, and these departments will vary depending on the type of business being practiced. Knowing the different functional areas of a business is a basic but major necessity for an entrepreneur, especially in the planning stage. "Functional areas" is defined as the grouping of activities or processes on the basis of what is needed to accomplish one or more tasks. It is also an alternative term for business unit. Let's dive right into the list:

1. Human Resources
Human resources are the most important asset in the business. The heart of an organization lies in its people. Without people, the day-to-day operation of a business would cease to function. The success of a business rests in the hands of the employees working in the company. In order to achieve the company's goals and objectives, the company's Human Resources department is responsible for recruiting the right people with the required skills, qualifications and experience. It is responsible for determining the salaries and wages of different job positions in the company, and is also involved in training employees for their development.
2. Marketing/Promotion
Promotional activities and advertising are the best ways to communicate with your target customers so that they come to know the company's products and services. Effective marketing and promotional activities will drive long-term success, profitability and growth in market share. This department is responsible for promoting the business to generate sales and help the company grow. Its function involves creating various marketing strategies and planning promotional campaigns. It is also responsible for monitoring competitors' activities. One good example of a business that developed an effective marketing strategy is Velvet Caviar, which came to dominate the market for iPhone XS Max cases.

3. Production
It is vital for a business that its products are of good quality and free from defects. The production department is concerned with manufacturing the products, where inputs (raw materials) are converted into finished output through a series of production processes. Its function is to ensure that the raw materials are made into finished products effectively, efficiently and in good quality. This department should also maintain the optimum inventory level.

4. Sales
In every business, the sales department plays the biggest role in the organization's success. The sales department is responsible for generating revenue and is tasked with ensuring that the sale of products and services results in profit. The sales department coordinates with the marketing department in terms of brand awareness, product launches and more. From the time the product leaves the production department, sales needs to develop ways to sell the product to its target users/customers.

5. Customer Service Support
The customer service department is responsible for interacting with customers regarding inquiries, complaints and orders. It also includes having a help desk/reception and contact centers. It is important for a business to create and maintain relationships with its customers. Customer service should be provided before, during and after the purchase. This department focuses on giving good service support, especially to potential, new and existing customers. Part of a business' customer relationship management is having efficient customer service support; a good relationship with customers will create customer loyalty.

6. Accounting and Finance
Cash flow is the lifeblood of any business, so it is important to manage the business' cash outflows and inflows. The company cannot operate without money, and if you cannot handle your money properly, you will lose control of your business. That is where the accounting and finance department comes in: the part of the organization that manages the company's money. This department is responsible for accounting, auditing, planning and organizing finances. It is also responsible for producing the company's financial statements.

7. Distribution
No matter how good the product is, it is deemed useless if it does not reach customers. If goods are not suitable for the distribution channel, the expenses involved in distribution will be considered wasted. The distribution department is responsible for receiving orders and delivering them to the customer at the right place, at the right time.

8. Research and Development
Innovation is the key to every business' future, as it opens up new competitive advantages for the company. Research and development acts as the catalyst in the innovation process.
The R&D department is responsible for product innovations, creating new designs and styles, as well as for finding new ways of producing products by keeping up to date with the latest technological and economic trends.

9. Administration and Management
Administration and management are the backbone of the business. Their function is to handle the business: planning, decision-making, and financial review. This department links with other departments to ensure the smooth flow of information and operations.

10. Operations
The operations department is responsible for overseeing, designing and controlling the process of production, and for redesigning business operations if necessary. In a manufacturing company, the operations department designs processes to produce the product efficiently. It also handles the acquisition of materials and the maintenance of equipment, supplies and more.

11. Information Technology Support
Computers and information systems are essential in business nowadays. The IT department acts as the backbone of smooth operations involving the latest technology relevant to the business. This department is responsible for creating software for other departments and for providing direct operating assistance in software use and data management to maintain the functional areas of the organization.

12. Purchasing
Purchasing is a basic function of an enterprise, especially in manufacturing companies. The purchasing department is responsible for the procurement of raw materials, machinery, equipment and supplies. This department ensures that the materials needed are in the right quantity, at the right price, made available at the right time, from the right supplier. It is also its task to inform top management of changes in prices or material developments that could affect the company's sales.

13. Legal Department
The legal department is tasked with overseeing and identifying legal issues in all departments. The department may also offer training and assistance with employee manuals to ensure that the company and its employees are kept up to date on workplace law, and it handles the filing of legal documents with government agencies. It also handles customer complaints in a professional manner and represents the company if sued. Its members act as the official and formal representatives on behalf of the company or the founder.

Emerging Trends in Business Research

When measuring business, a statistic may only be meaningful if compared to another. It could be compared to the corresponding figure for another entity, such as another country or industry, or to the same measurement taken earlier. When businesses compare a measurement to a series of the same statistic, they can identify a trend. Trend research and analysis allow companies to assess their work and predict future business. Emerging trends are watched carefully in every industry and for general commerce. The following are emerging trends in business research:

(i) Stock Indexes
Stock indexes indicate the health of the stock market. One of the most important is the Standard and Poor's 500. Standard and Poor's chooses 500 stocks to create the index, which serves as a proxy for the entire stock market. Another often-quoted index is the Dow Jones Industrial Average. Dow Jones & Company includes 30 large, frequently traded stocks in its sample to measure the stock market; they are issued by companies that are leaders in their industries. These indexes change whenever one of the stocks is traded, i.e., nearly constantly. Historical data for each is widely available for trend analysis, as the sketch below illustrates.
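As an illustration of the kind of trend analysis such historical data supports, here is a minimal Python sketch (added for illustration, not part of the original text): it computes year-over-year changes and a simple moving average for a hypothetical series of annual index closes.

```python
# Minimal trend-analysis sketch: year-over-year change and a simple
# moving average over a series of index closes (values are hypothetical).

def yoy_change(series):
    """Percentage change of each value relative to the previous one."""
    return [(curr - prev) / prev * 100 for prev, curr in zip(series, series[1:])]

def moving_average(series, window=3):
    """Simple moving average, used to smooth short-term noise."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

closes = [1100, 1250, 1320, 1280, 1410, 1480]  # hypothetical annual closes
print(yoy_change(closes))      # first entry: (1250 - 1100) / 1100 * 100 = 13.6...
print(moving_average(closes))  # 3-year smoothed values
```

A rising moving average alongside mostly positive year-over-year changes is the simplest signal of an upward trend.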
(ii) Monthly and Annual Retail Trade Reports
The U.S. Census Bureau releases aggregate monthly and annual trade reports for the nation. They are useful for evaluating business activity by industry. Data from the last 10 years are available online, allowing the researcher to identify long-term trends.

(iii) Employment Statistics
Each month, the Bureau of Labor Statistics (BLS) publishes employment statistics for the previous month, and these are crucial data. The BLS defines the labor force as people over the age of 16 who are either employed or unemployed and actively seeking employment. The BLS measures the work force in many ways. It releases the overall figure for unemployment, which is closely monitored by economists and the business community; an unemployment rate over 10 percent is considered undesirable. The labor force is also measured by demographics, earnings, hours worked and participation, i.e., the percentage of the total population in the labor force.

(iv) Consumer Confidence
A research organization called the Conference Board measures the public's opinion of the economy, as well as its expectations of the future economy. It publishes an aggregate figure, the Consumer Confidence Index, in its Consumer Confidence Survey. Also included in the survey are figures measuring more specific attitudes, such as the short-term outlook, job prospects and assessments of present conditions.

(v) Gross Domestic Product
An economy's Gross Domestic Product (GDP) is one of the most important macro-indicators of business activity. This figure represents the sum of all products (goods and services) produced in the economy. Real GDP adjusts GDP for inflation for a more precise measure; for example, if nominal GDP grows 5 percent while prices rise 3 percent, real GDP grows by roughly 2 percent. Economists calculate the GDP of all countries and U.S. states. According to The World Bank, in 2009 the U.S. had a GDP of $14,256,300 million (about $14.3 trillion), nearly 25 percent of world GDP.

Research and the Scientific Method

For a clear perception of the term research, one should know the meaning of scientific method. The two terms, research and scientific method, are closely related. Research, as we have already stated, can be termed as "an inquiry into the nature of, the reasons for, and the consequences of any particular set of circumstances, whether these circumstances are experimentally controlled or recorded just as they occur. Further, research implies the researcher is interested in more than particular results; he is interested in the repeatability of the results and in their extension to more complicated and general situations." On the other hand, the philosophy common to all research methods and techniques, although they may vary considerably from one science to another, is usually given the name of scientific method. Karl Pearson writes, "The scientific method is one and the same in all branches (of science) and that method is the method of all logically trained minds … the unity of all sciences consists alone in its methods, not its material; the man who classifies facts of any kind whatever, who sees their mutual relation and describes their sequences, is applying the Scientific Method and is a man of science." Scientific method is the pursuit of truth as determined by logical considerations. The ideal of science is to achieve a systematic interrelation of facts.
Scientific method attempts to achieve "this ideal by experimentation, observation, logical arguments from accepted postulates and a combination of these three in varying proportions." In scientific method, logic aids in formulating propositions explicitly and accurately so that their possible alternatives become clear. Further, logic develops the consequences of such alternatives, and when these are compared with observable phenomena, it becomes possible for the researcher or the scientist to state which alternative is most in harmony with the observed facts. All this is done through experimentation and survey investigations, which constitute the integral parts of scientific method. Experimentation is done to test hypotheses and to discover new relationships, if any, among variables. But the conclusions drawn on the basis of experimental data are generally criticized for faulty assumptions, poorly designed experiments, badly executed experiments or faulty interpretations. As such, the researcher must pay all possible attention while developing the experimental design and must state only probable inferences. The purpose of survey investigations may also be to provide scientifically gathered information to serve as a basis for the researchers' conclusions.

Basic Postulates of Scientific Method

The scientific method is thus based on certain basic postulates, which can be stated as under:
(i) It relies on empirical evidence;
(ii) It utilizes relevant concepts;
(iii) It is committed to only objective considerations;
(iv) It presupposes ethical neutrality, i.e., it aims at nothing but making adequate and correct statements about population objects;
(v) It results in probabilistic predictions;
(vi) Its methodology is made known to all concerned for critical scrutiny and for use in testing the conclusions through replication;
(vii) It aims at formulating the most general axioms, or what can be termed scientific theories.

Thus, "the scientific method encourages a rigorous, impersonal mode of procedure dictated by the demands of logic and objective procedure." Accordingly, scientific method implies an objective, logical and systematic method: a method free from personal bias or prejudice, a method to ascertain demonstrable qualities of a phenomenon capable of being verified, a method wherein the researcher is guided by the rules of logical reasoning, a method wherein the investigation proceeds in an orderly manner, and a method that implies internal consistency.

Characteristics of Scientific Method

Five (5) Major Characteristics of the Scientific Method

The scientific method is the system used by scientists to explore data, generate and test hypotheses, develop new theories and confirm or reject earlier results. Although the exact methods used in the different sciences vary (for example, physicists and psychologists work in very different ways), they share some fundamental attributes that may be called characteristics of the scientific method.

1. Empirical Observation
The scientific method is empirical. That is, it relies on direct observation of the world, and disdains hypotheses that run counter to observable fact. This contrasts with methods that rely on pure reason (including that proposed by Plato) and with methods that rely on emotional or other subjective factors.

2. Replicable Experiments
Scientific experiments are replicable. That is, if another person duplicates the experiment, he or she will get the same results. Scientists are supposed to publish enough of their method so that another person, with appropriate training, could replicate the results. This contrasts with methods that rely on experiences that are unique to a particular individual or a small group of individuals.
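In computational research, replicability often comes down to publishing the full procedure, including the random seed. The following Python sketch (an illustration added here, not from the original text) shows two runs of a simulated experiment agreeing exactly once the seed is documented and fixed.

```python
# Replicability in computational work: publishing the procedure and the
# random seed lets another person reproduce the identical result.
import random

def run_experiment(seed, n=1000):
    rng = random.Random(seed)           # fixed seed -> reproducible draws
    sample = [rng.gauss(0, 1) for _ in range(n)]
    return sum(sample) / n              # sample mean of the simulated data

# Two independent "replications" with the documented seed agree exactly.
assert run_experiment(seed=42) == run_experiment(seed=42)
print(run_experiment(seed=42))
```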
3. Provisional Results
Results obtained through the scientific method are provisional; they are (or ought to be) open to question and debate. If new data arise that contradict a theory, that theory must be modified. For example, the phlogiston theory of fire and combustion was rejected when evidence against it arose.

4. Objective Approach
The scientific method is objective. It relies on facts and on the world as it is, rather than on beliefs, wishes or desires. Scientists attempt (with varying degrees of success) to remove their biases when making observations.

5. Systematic Observation
Strictly speaking, the scientific method is systematic; that is, it relies on carefully planned studies rather than on random or haphazard observation. Nevertheless, science can begin from some random observation. Isaac Asimov said that the most exciting phrase to hear in science is not "Eureka!" but "That's funny." After the scientist notices something funny, he or she proceeds to investigate it systematically.

Steps in the Research Process

Scientific research involves a systematic process that focuses on being objective and gathering a multitude of information for analysis so that the researcher can come to a conclusion. This process is used in all research and evaluation projects, regardless of the research method (scientific method of inquiry, evaluation research, or action research). The process focuses on testing hunches or ideas in a park and recreation setting through a systematic process. In this process, the study is documented in such a way that another individual can conduct the same study again. Any research done without documenting the study so that others can review the process and results is not an investigation using the scientific research process. The scientific research process is a multiple-step process whose steps are interlinked with the other steps in the process. If changes are made in one step of the process, the researcher must review all the other steps to ensure that the changes are reflected throughout the process. Parks and recreation professionals are often involved in conducting research or evaluation projects within the agency. These professionals need to understand the eight steps of the research process as they apply to conducting a study.

Step 1: Identify the Problem
The first step in the process is to identify a problem or develop a research question. The research problem may be something the agency identifies as a problem, some knowledge or information that is needed by the agency, or the desire to identify a recreation trend nationally.

Step 2: Review the Literature
Now that the problem has been identified, the researcher must learn more about the topic under investigation. To do this, the researcher must review the literature related to the research problem. This step provides foundational knowledge about the problem area. The review of literature also educates the researcher about what studies have been conducted in the past, how these studies were conducted, and the conclusions in the problem area.
In the obesity study, the review of literature enables the programmer to discover horrifying statistics related to the long-term effects of childhood obesity in terms of health issues, death rates, and projected medical costs. In addition, the programmer finds several articles and information from the Centers for Disease Control and Prevention that describe the benefits of walking 10,000 steps a day. The information discovered during this step helps the programmer fully understand the magnitude of the problem, recognize the future consequences of obesity, and identify a strategy to combat obesity (i.e., walking).

Step 3: Clarify the Problem
Many times the initial problem identified in the first step of the process is too large or broad in scope. In step 3 of the process, the researcher clarifies the problem and narrows the scope of the study. This can only be done after the literature has been reviewed. The knowledge gained through the review of literature guides the researcher in clarifying and narrowing the research project. In the example, the programmer has identified childhood obesity as the problem and the purpose of the study. This topic is very broad and could be studied based on genetics, family environment, diet, exercise, self-confidence, leisure activities, or health issues. All of these areas cannot be investigated in a single study; therefore, the problem and purpose of the study must be more clearly defined. The programmer has decided that the purpose of the study is to determine if walking 10,000 steps a day for three days a week will improve the individual's health. This purpose is more narrowly focused and researchable than the original problem.

Step 4: Clearly Define Terms and Concepts
Terms and concepts are words or phrases used in the purpose statement of the study or the description of the study. These items need to be specifically defined as they apply to the study. Terms or concepts often have different definitions depending on who is reading the study. To minimize confusion about what the terms and phrases mean, the researcher must specifically define them for the study. In the obesity study, the concept of "individual's health" can be defined in hundreds of ways, such as physical, mental, emotional, or spiritual health. For this study, the individual's health is defined as physical health. The concept of physical health may also be defined and measured in many ways. In this case, the programmer decides to define "individual health" more narrowly, referring to the areas of weight, percentage of body fat, and cholesterol. By defining the terms or concepts more narrowly, the scope of the study is more manageable for the programmer, making it easier to collect the necessary data for the study.

Step 5: Define the Population
Research projects can focus on a specific group of people, facilities, park development, employee evaluations, programs, financial status, marketing efforts, or the integration of technology into the operations. For example, if a researcher wants to examine a specific group of people in the community, the study could examine a specific age group, males or females, people living in a specific geographic area, or a specific ethnic group. Literally thousands of options are available to the researcher to specifically identify the group to study. The research problem and the purpose of the study assist the researcher in identifying the group to involve in the study. In research terms, the group to involve in the study is always called the population. Defining the population assists the researcher in several ways. First, it narrows the scope of the study from a very large population to one that is manageable. Second, the population identifies the group that the researcher's efforts will be focused on within the study. This helps ensure that the researcher stays on the right path during the study. Finally, by defining the population, the researcher identifies the group that the results will apply to at the conclusion of the study.
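To make the idea concrete, here is a minimal Python sketch (the roster names and numbers are hypothetical, not from the original study) of drawing a simple random sample from a defined population, which is exactly what the obesity example does in the next step.

```python
# Sketch: drawing a study sample from a defined population (a hypothetical
# roster of 10- to 12-year-olds, echoing the obesity example).
import random

population = [f"child_{i}" for i in range(1, 501)]  # defined population (N = 500)

random.seed(7)                          # seed recorded so the draw is repeatable
sample = random.sample(population, 50)  # simple random sample (n = 50)
print(len(sample), sample[:5])
```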
Step 6: Develop the Instrumentation Plan
The plan for the study is referred to as the instrumentation plan. The instrumentation plan serves as the road map for the entire study, specifying who will participate in the study; how, when, and where data will be collected; and the content of the program. In the obesity study, the researcher has decided to have the children participate in a walking program for six months. The group of participants is called the sample, which is a smaller group selected from the population specified for the study. The study cannot possibly include every 10- to 12-year-old child in the community, so a smaller group is used to represent the population. The researcher develops the plan for the walking program, indicating what data will be collected, when and how the data will be collected, who will collect the data, and how the data will be analyzed. The instrumentation plan specifies all the steps that must be completed for the study. This ensures that the programmer has carefully thought through all these decisions and that she provides a step-by-step plan to be followed in the study.

Step 7: Collect Data
Once the instrumentation plan is completed, the actual study begins with the collection of data. The collection of data is a critical step in providing the information needed to answer the research question. Every study includes the collection of some type of data (whether from the literature or from subjects) to answer the research question. Data can be collected in the form of words on a survey, with a questionnaire, through observations, or from the literature. In the obesity study, the programmers will be collecting data on the defined variables: weight, percentage of body fat, cholesterol levels, and the number of days the person walked a total of 10,000 steps during the class. The researcher collects these data at the first session and at the last session of the program. These two sets of data are necessary to determine the effect of the walking program on weight, body fat, and cholesterol level. Once the data are collected on the variables, the researcher is ready to move to the final step of the process, which is the data analysis.

Step 8: Analyze the Data
All the time, effort, and resources dedicated to steps 1 through 7 of the research process culminate in this final step. The researcher finally has data to analyze so that the research question can be answered. In the instrumentation plan, the researcher specified how the data will be analyzed. The researcher now analyzes the data according to the plan. The results of this analysis are then reviewed and summarized in a manner directly related to the research questions. In the obesity study, the researcher compares the measurements of weight, percentage of body fat, and cholesterol that were taken at the first meeting of the subjects to the measurements of the same variables at the final program session. These two sets of data will be analyzed to determine if there was a difference between the first measurement and the second measurement for each individual in the program. Then, the data will be analyzed to determine if the differences are statistically significant. If the differences are statistically significant, the study validates the theory that was the focus of the study. The results of the study also provide valuable information about one strategy to combat childhood obesity in the community.
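As a concrete illustration of this final step, the sketch below runs a paired t-test on hypothetical before-and-after measurements. A paired t-test is one common choice for this kind of comparison, though the original text does not name a specific test; the scipy library and all values here are assumptions for illustration.

```python
# Sketch of the Step 8 analysis: compare each participant's first and final
# measurements and test whether the mean difference is statistically
# significant (hypothetical data, paired t-test as one common choice).
from scipy import stats

weight_before = [52.1, 48.9, 55.3, 60.2, 51.7, 58.4, 49.8, 53.6]  # kg, session 1
weight_after  = [50.8, 48.1, 54.0, 58.9, 51.5, 56.7, 49.0, 52.2]  # kg, final session

t_stat, p_value = stats.ttest_rel(weight_before, weight_after)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:  # conventional significance threshold
    print("Difference is statistically significant at the 5% level.")
```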
Formulation of the Research Problem

Five (5) Ways to Formulate the Research Problem

1. Specify the Research Objectives
A clear statement defining your objectives will help you develop effective research. It will help the decision makers evaluate the research questions your project should answer as well as the research methods your project will use to answer those questions. It is critical that you have manageable objectives: two or three clear goals will help to keep your research project focused and relevant.

2. Review the Environment or Context of the Research Problem
As a marketing researcher, you must work closely with your team of researchers in defining and testing environmental variables. This will help you determine whether the findings of your project will produce enough information to be worth the cost. In order to do this, you have to identify the environmental variables that will affect the research project and begin formulating different methods to control these variables.

3. Explore the Nature of the Problem
Research problems range from simple to complex, depending on the number of variables and the nature of their relationship. Sometimes the relationship between two variables is directly related to a problem or question, and other times the relationship is entirely unimportant. If you understand the nature of the research problem as a researcher, you will be able to develop a better solution for the problem. To help you understand all its dimensions, you might want to consider focus groups of consumers, salespeople, managers, or professionals to provide what is sometimes much-needed insight into a particular set of questions or problems.

4. Define the Variable Relationships
Marketing plans often focus on creating a sequence of behaviors that occur over time, as in the adoption of a new package design or the introduction of a new product. Such programs create a commitment to follow some behavioral pattern or method in the future. Studying such a process involves:
• Determining which variables affect the solution to the research problem.
• Determining the degree to which each variable can be controlled and used for the purposes of the company.
• Determining the functional relationships between the variables and which variables are critical to the solution of the research problem.
During the problem formulation stage, you will want to generate and consider as many courses of action and variable relationships as possible.

5. The Consequences of Alternative Courses of Action
There are always consequences to any course of action used in one or more projects. Anticipating and communicating the possible outcomes of various courses of action is a primary responsibility in the research process.

Research Proposal: Elements of a Research Proposal

A research proposal is a document proposing a research project, generally in the sciences or academia, which generally constitutes a request for sponsorship of that research.
Proposals are evaluated on the cost and potential impact of the proposed research, and on the soundness of the proposed plan for carrying it out. A research proposal is typically written by a scientist or academic and describes the ideas for an investigation of a certain topic. It outlines the process from beginning to end and may be used to request financing for the project, certification for performing certain parts of the research or experiment, or as a required task before beginning a college dissertation.

So, let's take a look at what a research proposal is. When someone is interested in obtaining support for research, they often write a research proposal. These proposals are intended to convince people that your ideas and projects are important, and they strive to explain how you can satisfactorily complete the project. A research proposal needs to let people know why the project is a good and/or needed idea and that you understand what information and studies are already out there. Keep in mind that the way the proposal is written is also important, as grammar, structure, and content can make a difference in whether the proposal is accepted or rejected.

Elements of a Research Proposal

Writing a good proposal will help you manage your time so that you can complete the quarter with three papers that meet your objectives. The specific format and content of these elements may vary; they may not always appear as separate sections or in the order listed here.
• Background of the study
• Problem statement
• Objectives of the study
• Significance of the study
• Limitations of the study
• Definition of terms
• Literature review
• Methodology

1. Background of the Study
The main idea of the background of the study is to establish the area of research in which your work belongs and to provide a context for the research problem. It also provides information about the research topic. In the introduction, the writer should:
• Create reader interest in the topic;
• Lay the broad foundation for the problem that leads to the study.

2. Statement of the Problem
When you start a research project, you have a question that you wish to seek an answer to. The question leads to a problem that needs to be solved by the research. Begin the research with a description of the problem or a thesis statement.

3. Objectives of the Study
State what your research hopes to accomplish.

4. Significance of the Study
State why your research is important and what contributions it will make to the field. It is also advisable to state how your findings can make a difference and why it is important that the research be carried out.

5. Limitations of the Study
It is not possible to include ALL aspects of a particular problem. State what is not included, and specify the boundaries of your research. Too wide an area of investigation is impractical and will lead to problems.

6. Definition of Terms
Terms or concepts that you use should be defined and explained unless they are familiar or obvious. You should refer to authoritative sources for definitions.

7. Literature Review
This section need not be lengthy, but it should reflect your understanding of the relevant bodies of literature. List all pertinent papers or reports that you have consulted in preparing the proposal; include conversations with faculty, peers or other experts. A well-written review provides a sense of the critical issues that form the background for your own work this quarter. It also shows that you are aware of the literature study that is required in your research area.
Reviewing a substantial amount of reading material before writing your proposal shows that you have sufficient theoretical knowledge in your chosen research area. Reviewing related literature at this stage will make you:
• Aware of other similar work that has been done.
• Aware of methodologies that have been adopted and which you may use or adapt.
• Aware of sources of information that you do not have yet.
Reviewing related literature at this stage will also inform you:
• Whether a chosen area has already been researched extensively.
• Of approaches that you did not know of before.

8. Methodology
This section is the heart of the proposal because it provides insight into your perspective as well as details on how you plan to carry out the project. How will you accomplish your objective(s)? What theories or concepts will guide the study? How do they, or might they, suggest the specific hypotheses or research questions? Where might you run into obstacles? Explain the specifics of what you want to present in your project (statistical data, comparisons of historical and recent data, the evolution of a paradigm, etc.). One way to do this is by developing a rough outline of the major topics and sub-topics that you will investigate. Your timeline and a very rough scope (past, current, future) have been pre-determined. If outside organizations are involved, explain how you are going to get hold of the data. Indicate why the methodology is used; if an existing methodology is not to be used, explain why you need to use an adapted methodology.

Evaluating a Research Proposal

1. Make Sure the Proposal Responds to Your Objectives
The proposal process begins before the research firm offers you their take on how they recommend you conduct your survey and for what price. Instead, the process begins with the first discussion you have about the survey. Did the researcher take the opportunity to ask you specific questions about your objectives, the group of people you would like to survey, and your ultimate goals? Details regarding your situation should pop up throughout the proposal. Surveys whose proposals do not incorporate your individual needs cannot possibly meet them. For example, one research firm may recommend a very straightforward survey plan with a low price tag. Another may take time to recommend a more involved approach that considers your needs and objectives, but at a higher price. Ask the higher-priced firm to account for the differences; they should have no problem explaining the rationale behind their recommendations. On the other hand, have the low-cost provider explain why they believe such a straightforward survey is right for you.

2. Sampling Plan
When reviewing the sampling plan, make sure the proposal mentions sample size, response rate, number of responses, and maximum sampling error. These figures help you determine whether you will be able to confidently project the survey results to your entire population of interest. If you are unsure of the impact these figures have on the quality of your results, ask the researcher; they should be able to explain them in terms you can understand. If you are interested in learning details about specific segments of your circulation, make sure that the sampling plan accounts for them. Sometimes a simple random selection of names is enough. In other cases, small but important groups should be over-sampled in order to collect enough responses to tell a story about the group.
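For a sense of how sample size drives the maximum sampling error a proposal should report, here is a minimal Python sketch using the standard margin-of-error formula for a proportion at 95% confidence (an illustration; the sample sizes are hypothetical, not from any particular proposal).

```python
# Sketch: maximum sampling error (margin of error) for a proportion at 95%
# confidence, the figure a sampling plan should report alongside sample size.
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Worst case is p = 0.5; z = 1.96 corresponds to 95% confidence."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 400, 1000):
    print(n, f"±{margin_of_error(n) * 100:.1f} percentage points")
# 400 completed responses -> roughly ±4.9 points
```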
3. Questionnaire
The quantity and types of information sought from respondents will impact cost. Quantity encompasses the number of questionnaire pages and the number of variables to process. Type refers to how the questions will be processed, the data entry involved, and whether all or just some data will be cleaned. No evaluation is complete until you know the approximate number and types of questions planned for the survey. The number of open-ended questions should be included as well, because open-ended questions that capture verbatim responses can impact the response rate and the price of your survey. While these details can change during design, knowing the starting point helps establish what additional questions, pages, or transcribed questions will add to your bottom line. In addition, make sure the proposal clearly indicates who will develop the questionnaire content, and determine whether it includes enough collaboration time for the questionnaire to be sufficiently customized to meet your particular needs.

4. Data Collection Approach
For online surveys that invite respondents via email to respond to a web-based survey, paying attention to the data collection series can mean the difference between conducting a successful survey and one that frustrates your circulation. Multiple emails to respondents can encourage response by arriving in their inbox throughout the day; some arrival times will be more convenient for your sample than others. However, you should only send follow-up emails to non-respondents. Out of privacy concerns, sample members should have the opportunity to opt out with each contact. Outbound emails must also be coded to allow only one response per person and to prevent others from taking the survey. Proposals for mailed surveys should clearly outline the data collection series and each component of the survey kit. A sophisticated mailing series can efficiently improve response rates and increase the quality of data. Some cost-effective techniques that can boost response rates include the use of incentives, stamped reply envelopes, follow-up survey kits to non-respondents, alert letters or postcards, and personalization.

5. Data Processing
Your proposal should highlight the steps the research company will take to make sure that the data are accurate and representative. Depending on the type of survey, checking logic, consistency, and outliers can take a significant amount of time. You must have some process noted to identify inconsistent answers for surveys that collect a significant amount of numerical data (salary surveys, market studies, budget planning). Finally, some percentage of mailed surveys needs to be verified for data entry accuracy.

6. Analysis
A straightforward analysis of survey data can meet many objectives. In other cases, a multivariate statistical analysis will provide deeper insights to achieve your objectives, making the results easier to use. If your objectives include learning about separate segments of your circulation, cross tabulations should be specified (a minimal sketch appears below).

7. Deliverables
A variety of reporting options exist for a survey. These include, but are not limited to, data tables, a summary of the results, in-depth analysis, and graphed presentations. As a result, you need to understand exactly what you will receive following your survey and in what format. If report and data table samples are not offered, ask for them. You want to make sure that data tables are easy to read and attractive. Consider how well write-ups enhance the clarity and usability of the results. If you plan to use your reports in presentations, make sure they will reflect well on you, and do not forget to consider the time you will have to take to reformat the results.
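The cross tabulation mentioned under "Analysis" above can be as simple as the following Python sketch using pandas (assumed here purely for illustration; the segment names and responses are hypothetical).

```python
# Sketch: cross-tabulating satisfaction by circulation segment.
import pandas as pd

responses = pd.DataFrame({
    "segment":      ["print", "print", "digital", "digital", "print", "digital"],
    "satisfaction": ["high",  "low",   "high",    "high",    "high",  "low"],
})

# Row percentages show how satisfaction differs across segments.
table = pd.crosstab(responses["segment"], responses["satisfaction"],
                    normalize="index")
print(table.round(2))
```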
Unit 2: Research Design

Research Design; Features of a Good Research Design

Research design is defined as a framework of methods and techniques chosen by a researcher to combine various components of research in a reasonably logical manner so that the research problem is efficiently handled. It provides insights into "how" to conduct research using a particular methodology.

Types of Research Design

A researcher must have a clear understanding of the various types of research design to select which type to implement for a study. Research design can be broadly classified into quantitative and qualitative research design.

1. Qualitative Research Design
Qualitative research is implemented in cases where a relationship between collected data and observation cannot be established on the basis of mathematical calculations, and where theories related to a naturally occurring phenomenon cannot simply be proved or disproved by such calculations. Researchers rely on qualitative research design where they are expected to conclude "why" a particular theory exists along with "what" respondents have to say about it.

2. Quantitative Research Design
Quantitative research is implemented in cases where it is important for a researcher to have statistical conclusions to collect actionable insights. Numbers provide a better perspective for making important business decisions. Quantitative research design is important for the growth of any organization because conclusions drawn on the basis of numbers and analysis will prove effective for the business.

Further, research design can be divided into five types:

(I) Descriptive Research Design: In a descriptive research design, a researcher is solely interested in describing the situation or case under his/her research study. It is a theory-based research design created by gathering, analyzing and presenting collected data. By implementing an in-depth research design such as this, a researcher can provide insights into the why and how of research.

(II) Experimental Research Design: Experimental research design is used to establish a relationship between the cause and effect of a situation. It is a causal research design where the effect caused by the independent variable on the dependent variable is observed. For example, the effect of an independent variable such as price on a dependent variable such as customer satisfaction or brand loyalty is monitored. It is a highly practical research design method as it contributes towards solving a problem at hand. The independent variables are manipulated to monitor the change they produce in the dependent variable. It is often used in the social sciences to observe human behavior by analyzing two groups and the effect of one group on the other.

(III) Correlational Research Design: Correlational research is a non-experimental research design technique which helps researchers establish a relationship between two closely connected variables. Two different groups are required to conduct this research design method. No assumption is made while evaluating a relationship between two different variables, and statistical analysis techniques are used to calculate the relationship between them. Correlation between two variables is expressed by a correlation coefficient, whose value ranges between -1 and +1. If the correlation coefficient is towards +1, it indicates a positive relationship between the variables; a value towards -1 indicates a negative relationship between the two variables. A minimal computation is sketched below.
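Here is a minimal Python sketch (an illustration added here, not from the original text) that computes the Pearson correlation coefficient directly from its definition for two hypothetical variables; the result lies between -1 and +1, as described above.

```python
# Sketch: Pearson correlation coefficient between two closely connected
# variables, computed from its definition (data are hypothetical).
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

hours_studied = [2, 4, 5, 7, 9]
test_scores   = [55, 60, 68, 74, 85]
print(round(pearson_r(hours_studied, test_scores), 3))  # close to +1
```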
(IV) Diagnostic Research Design: In diagnostic research design, a researcher is inclined towards evaluating the root cause of a specific topic. Elements that contribute towards a troublesome situation are evaluated in this research design method. There are three parts of diagnostic research design:
• Inception of the issue
• Diagnosis of the issue
• Solution for the issue

(V) Explanatory Research Design: In explanatory research design, the researcher's ideas and thoughts are key, as it is primarily dependent on their personal inclination about a particular topic. Explanation of the unexplored aspects of a subject is provided, along with details about the what, how and why of the research questions.

Features of a Good Research Design

A good research design is often characterized by adjectives like flexible, appropriate, efficient, economical and so on. Generally, the design which minimizes bias and maximizes the reliability of the data collected and analyzed is considered a good design. The design which gives the smallest experimental error is supposed to be the best design in many investigations. Similarly, a design which yields maximal information and provides an opportunity for considering many different aspects of a problem is considered the most appropriate and efficient design for many research problems. Thus, the question of good design is related to the purpose or objective of the research problem and also to the nature of the problem to be studied. A design may be quite suitable in one case but found wanting in one respect or another in the context of some other research problem. One single design cannot serve the purpose of all types of research problems. A research design appropriate for a particular research problem usually involves the consideration of the following factors:
1. The means of obtaining information;
2. The availability and skills of the researcher and his staff, if any;
3. The objective of the problem to be studied;
4. The nature of the problem to be studied; and
5. The availability of time and money for the research work.

Use of a Good Research Design: Qualitative and Quantitative Approaches

When to use qualitative vs. quantitative research

Quantitative data can help you see the big picture. Qualitative data adds the details and can also give a human voice to your survey results. Let's see how to use each method in a research project.
• Formulating hypotheses: Qualitative research helps you gather detailed information on a topic. You can use it to initiate your research by discovering the problems or opportunities people are thinking about. Those ideas can become hypotheses to be proven through quantitative research.
• Validating your hypotheses: Quantitative research will get you numbers to which you can apply statistical analysis in order to validate your hypotheses. Was that problem real or just someone's perception? The hard facts obtained will enable you to make decisions based on objective observations.
• Finding general answers: Quantitative research usually has more respondents than qualitative research because it is easier to conduct a multiple-choice survey than a series of interviews or focus groups. Therefore it can help you definitively answer broad questions like: Do people prefer you to your competitors?
Which of your company's services are most important? Which ad is most appealing?
• Incorporating the human element: Qualitative research can also help in the final stages of your project. The quotes you obtain from open-ended questions can put a human voice to the objective numbers and trends in your results. Many times it helps to hear your customers describe your company in their own words and so uncover your blind spots. Qualitative data will get you that.

How to balance qualitative and quantitative research

These two research methods do not conflict with each other; they actually work much better as a team. In a world of big data, there is a wealth of statistics and figures that form the strong foundation on which your decisions can rest. But that foundation is incomplete without the information collected from real people, which gives the numbers meaning. So how do you put these two forms of research together? Qualitative research is almost always the starting point when you seek to discover new problems and opportunities, which will help you do deeper research later. Quantitative data will give you measurements to confirm each problem or opportunity and understand it.

How about an example? Let's say you held a conference and wanted feedback from your attendees. You can probably already measure several things with quantitative research, such as attendance rate, overall satisfaction, quality of speakers, value of the information given, etc. All these questions can be asked in a closed-ended and measurable way. But you may also want to provide a few open-ended, qualitative research questions to find out what you may have overlooked. You could use questions like:
• What did you enjoy most about the conference?
• How could we improve your experience?
• Is there any feedback on the conference you think we should be aware of?

If you discover any common themes through these qualitative questions, you can decide to research them in more depth, make changes to your next event, and make sure to add quantitative questions about these topics after the next conference. For example, let's say several attendees said that their least favorite thing about the conference was the difficult-to-reach location. Next time, your survey might ask quantitative questions like how satisfied people were with the location, or let respondents choose from a list of potential sites they would prefer.

Open-ended vs. closed-ended questions

A good way of recognizing when you want to switch from one method to the other is to look at your open-ended questions and ask yourself why you are using them. For example, if you asked, "What do you think of our ice cream prices?", people would give you feedback in their own words and you would probably get some out-of-the-box answers. If that is not what you are looking for, you should consider using an easily quantifiable response. For example: Relative to our competitors, do you think our ice cream prices are:
• Higher
• About the same
• Lower
This kind of question will give your survey respondents clarity, and in turn it will provide you with consistent data that are easy to analyze.

How to get qualitative data

There are many methods you can use to conduct qualitative research that will get you richly detailed information on your topic of interest.
• Interviews. One-on-one conversations that go deep into the topic at hand.
• Case studies. Collections of client stories from in-depth interviews.
• Expert opinions. High-quality information from well-informed sources.
• Focus groups. In-person or online conversations with small groups of people to listen to their views on a product or topic.
• Open-ended survey questions. A text box in a survey that lets the respondent express their thoughts on the matter at hand freely.
• Observational research. Observing people during the course of their habitual routines to understand how they interact with a product, for example.

However, this open-ended method of research does not always lend itself to bringing you the most accurate answers to big questions, and analyzing the results is hard because people will use different words and phrases to describe their points of view, and may not even talk about the same things if they find space to roam with their responses. In some cases, it may be more effective to go "full quantitative" with your questions.

Why Collect Quantitative Data?

Qualitative survey questions can run the risk of being too vague. To avoid confusing your respondents, you may want to eschew questions like "What do you think about our internet service?" Instead, you could ask a closed-ended, quantitative question, as in the following example.

The internet service is reliable:
• Always
• Most of the time
• About half the time
• Once in a while
• Never

Qualitative questions take longer to answer. Survey respondents do not always have the patience to reflect on what they are being asked and write long responses that accurately express their views. It is much faster to choose one of several pre-loaded options in a questionnaire. Using quantitative questions helps you get more questions into your survey and more responses out of it. Quantitative survey questions are also simply more quantifiable: even word responses in closed-ended questionnaires can be assigned numerical values that you can later convert into indicators and graphs. This means that the overall quality of the data is better. Remember that the most accurate data leads you to the best possible decisions.

Quantitative questions:
How long have you been a customer of our company?
• This is my first purchase
• Less than six months
• Six months to a year
• 1-2 years
• 3 or more years
• I haven't made a purchase yet
How likely are you to purchase any of our products again?
• Extremely likely
• Very likely
• Somewhat likely
• Not so likely
• Not at all likely

Qualitative follow-up question:
Do you have any other comments, questions, or concerns?

Quantitative question:
When you make a mistake, how often does your supervisor respond constructively?
• Always
• Most of the time
• About half of the time
• Once in a while
• Never

Exploratory Research: Concept, Types

Exploratory research is defined as research used to investigate a problem which is not clearly defined. It is conducted to gain a better understanding of the existing problem, but it will not provide conclusive results. For such research, a researcher starts with a general idea and uses the research as a medium to identify issues that can be the focus of future research. An important aspect here is that the researcher should be willing to change his/her direction subject to the revelation of new data or insights. Such research is usually carried out when the problem is at a preliminary stage. It is often referred to as the grounded theory approach or interpretive research, as it is used to answer questions like what, why and how.
For example: Consider a scenario where a juice bar owner feels that increasing the variety of juices will bring in more customers; however, he is not sure and needs more information. He therefore decides to carry out exploratory research to find out whether expanding the juice selection will get him more customers, or whether there is a better idea.

Types of Exploratory Research

While it may sound a little difficult to research something that has very little information about it, there are several methods which can help a researcher figure out the best research design, data collection methods and choice of subjects. There are two ways in which research can be conducted, namely primary and secondary. Under these two types, there are multiple methods which can be used by a researcher. The data gathered from this research can be qualitative or quantitative. Some of the most widely used research designs include the following:

1. Primary Research Methods

Primary research is information gathered directly from the subject. It can be through a group of people or even an individual. Such research can be carried out directly by the researcher himself, or a third party can be employed to conduct it on the researcher’s behalf. Primary research is specifically carried out to explore a certain problem which requires an in-depth study.

(I) Surveys/polls: Surveys/polls are used to gather information from a predefined group of respondents. It is one of the most important quantitative methods. Various types of surveys or polls can be used to explore opinions, trends, etc. With the advancement in technology, surveys can now be sent online and are very easy to access, for instance through a survey app on tablets, laptops or even mobile phones. This information is also available to the researcher in real time. Nowadays, most organizations offer short surveys and rewards to respondents in order to achieve higher response rates.

For example: A survey is sent to a given set of the audience to understand their opinions about the size of mobile phones when they purchase one. Based on such information, the organization can dig deeper into the topic and make business-related decisions.

(II) Interviews: While you may get a lot of information from public sources, sometimes an in-person interview can give in-depth information on the subject being studied. This is a qualitative research method. An interview with a subject matter expert can give you meaningful insights that a generalized public source won’t be able to provide. Interviews are carried out in person or over the telephone, using open-ended questions to get meaningful information about the topic.

For example: An interview with an employee can give you more insight into the degree of job satisfaction, or an interview with a subject matter expert on quantum theory can give you in-depth information on that topic.

(III) Focus groups: The focus group is yet another widely used method in exploratory research. In this method a group of people is chosen and allowed to express their insights on the topic that is being studied. It is important to make sure that the individuals chosen for a focus group have a common background and comparable experiences.

For example: A focus group helps a researcher identify the opinions of consumers if they were to buy a phone.
Such research can help the researcher understand what consumers value while buying a phone; it may be screen size, brand value or even the dimensions. Based on this, the organization can understand consumer buying attitudes, consumer opinions, etc.

(IV) Observations: Observational research can involve qualitative or quantitative observation. Such research is done to observe a person and draw findings from their reactions to certain parameters. In such research, there is no direct interaction with the subject.

For example: An FMCG company wants to know how its consumers react to the new shape of their product. The researcher observes the customers’ first reactions and collects the data, which is then used to draw inferences from the collective information.

2. Secondary Research Methods

Secondary research is gathering information from previously published primary research. In such research you gather information from sources like case studies, magazines, newspapers, books, etc.

(I) Online Research: In today’s world, this is one of the fastest ways to gather information on any topic. A lot of data is readily available on the internet, and the researcher can download it whenever needed. An important aspect to be noted for such research is the genuineness and authenticity of the source websites that the researcher is gathering the information from.

For example: A researcher needs to find out what percentage of people prefer a specific phone brand. The researcher simply enters the information he needs in a search engine and gets multiple links with related information and statistics.

(II) Literature Research: Literature research is one of the most inexpensive methods used for discovering a hypothesis. There is a tremendous amount of information available in libraries, online sources, or even commercial databases. Sources can include newspapers, magazines, library books, documents from government agencies, specific topic-related articles, literature, annual reports, published statistics from research organisations and so on. However, a few things have to be kept in mind while researching from these sources. Government agencies have authentic information, but it may sometimes come at a nominal cost. Also, research from educational institutions is generally overlooked, but in fact educational institutions carry out more research than any other entity. Furthermore, commercial sources provide information on major topics like political agendas, demographics, financial information, market trends and information, etc.

For example: A company has low sales. Whether the problem is market-related or organisation-related can easily be explored from available statistics and market literature. If the topic being studied concerns the financial situation of the country, research data can be accessed through government documents or commercial sources.

(III) Case Study Research: Case study research can help a researcher find more information by carefully analyzing existing cases which have gone through a similar problem. Such analysis is very important and critical, especially in today’s business world. The researcher just needs to make sure he analyses the case carefully with regard to all the variables present in the previous case against his own case. It is very commonly used by business organisations, the social sciences, and even the health sector.
For example: A particular orthopedic surgeon has the highest success rate for performing knee surgeries. A lot of other hospitals and doctors have taken up this case to understand and benchmark the method by which this surgeon performs the procedure, in order to increase their own success rates.

Qualitative Techniques: Projective Techniques, Depth Interviews, Experience Survey, Focus Groups, Observation

1. Projective Techniques

Projective techniques are indirect and unstructured methods of investigation developed by psychologists, which use the projections of respondents to infer underlying motives, urges or intentions that cannot be secured through direct questioning, as the respondent either resists revealing them or is unable to figure them out himself. These techniques are useful in giving respondents opportunities to express their attitudes without personal embarrassment. They help respondents project their own attitudes and feelings unconsciously onto the subject under study. Thus projective techniques play an important role in motivational research and attitude surveys.

Important Projective Techniques
(I) Word Association Test
(II) Completion Test
(III) Construction Techniques
(IV) Expression Techniques

(I) Word Association Test: An individual is given a clue or hint and asked to respond with the first thing that comes to mind. The association can take the shape of a picture or a word. There can be many interpretations of the same thing. A list of words is given, and the respondent does not know which word the researcher is most interested in. The interviewer records the responses, which reveal the inner feelings of the respondents. The frequency with which any word is given as a response, and the amount of time that elapses before the response is given, are important for the researcher.

For example: Out of 50 respondents, 20 associate the word “fair” with “complexion”.

(II) Completion Test: In this, the respondents are asked to complete an incomplete sentence or story. The completion will reflect their attitude and state of mind.

(III) Construction Techniques: This is more or less like the completion test. Respondents can be given a picture and asked to write a story about it. The initial structure is limited and not detailed like the completion test. For example: two cartoons are given and a dialogue is to be written.

(IV) Expression Techniques: In this, people are asked to express the feelings or attitudes of other people.

Disadvantages of Projective Techniques
• Highly trained interviewers and skilled interpreters are needed.
• There may be interpreter bias.
• It is a costly method.
• The respondents selected may not be representative of the entire population.

2. Depth Interviews

A qualitative data collection method, in-depth interviews offer the opportunity to capture rich, descriptive data about people’s behaviors, attitudes and perceptions, and about unfolding complex processes. They can be used as a standalone research method or as part of a multi-method design, depending on the needs of the research.

How is an in-depth interview carried out? In-depth interviews are normally carried out face to face so that a rapport can be created with respondents. Body language is also used to add a high level of understanding to the answers. Telephones can also be used by a skilled researcher with little loss of data and at a tenth of the cost. The style of the interview depends on the interviewer. Successful in-depth interviewers listen rather than talk.
They have a clear line of questioning and use body language to build rapport.

3. Experience Survey

Most often taking the form of a text box in a survey, open-ended questions allow your respondents to provide a unique answer (as opposed to selecting from a list of predetermined responses). This approach gives respondents the freedom to say exactly what they feel about a topic, which provides you with exploratory data that may reveal unforeseen opportunities, issues, or quotes. You can then use this information to support the hard numbers you’ve collected in the survey. Often it is these quotes or examples that make more powerful statements than many averages and percentages.

4. Focus Groups

Usually done in person or online, a focus group asks a small group of people to discuss their thoughts on a given subject. A focus group allows you to gauge the reactions of a small number of your target audience in a controlled but free-flowing group discussion. This form of research is a great way to test how your target audience would perceive a new product or marketing strategy.

5. Observational Research

This approach involves observing customers or people in their actual element. A perfect example would be watching shoppers while they visit your store. How long does it take them to find what they are looking for? Do they look comfortable interacting with your staff? Where do they go first, second? When do they leave without making a purchase? These real-world observations can lead you to findings that more direct forms of research, like focus groups and interviews, would miss.

Descriptive Research Design: Concept, Types and Uses

Descriptive research is research used to “describe” a situation, subject, behavior, or phenomenon. It is used to answer questions of who, what, when, where, and how associated with a particular research question or problem. Descriptive studies are often described as studies that are concerned with finding out “what is”. Descriptive research attempts to gather quantifiable information that can be used to statistically analyze a target audience or a particular subject. It is used to observe and describe a research subject or problem without influencing or manipulating the variables in any way. Hence, these studies are really correlational or observational, and not truly experimental. This type of research is conclusive in nature, rather than exploratory. Therefore, descriptive research does not attempt to answer “why” and is not used to draw inferences, make predictions or establish causal relationships.

Descriptive research is used extensively in social science, psychology and educational research. It can provide a rich data set that often brings to light new knowledge or awareness that may otherwise have gone unnoticed. It is particularly useful when it is important to gather information without disruption of the subjects, or when it is not possible to test and measure large numbers of samples. It allows researchers to observe natural behaviors without affecting them in any way. Following is a list of research questions or problems that may lend themselves to descriptive research:

• Market researchers may want to observe the habits of consumers.
• A company may want to evaluate the morale of its staff.
• A school district may research whether or not students are more likely to access online textbooks than to use printed copies.
• A school district may wish to assess teachers’ attitudes about using technology in the classroom.
• An educational software company may want to know what aspects of the software make it more likely to be used by students.
• A researcher may wish to study the impact of hands-on activities and laboratory experiments on students’ perceptions of science.
• A researcher could be studying whether or not the availability of hiking/biking trails increases the physical activity levels in a neighborhood.

Types of Descriptive Research

In some types of descriptive research, the researcher does not interact with the subjects. In other types, the researcher does interact with the subjects and collects information directly from them. Some descriptive studies may be cross-sectional, whereby the researcher has a one-time interaction with the test subjects. Other studies may be longitudinal, where the same test subjects are followed over time. There are three main methods that may be used in descriptive research:

• Observational Method: Used to review and record the actions and behaviors of a group of test subjects in their natural environment. The researcher typically does not interact with the test subjects.
• Case Study Method: This is a much more in-depth study of an individual or small group of individuals. It may or may not involve interaction with the test subjects.
• Survey Method: Researchers interact with individual test subjects by collecting information through the use of surveys or interviews.

Concept of Cross-Sectional and Longitudinal Research

A cross-sectional study is defined as an observational study where data is collected as a whole to study a population at a single point in time to examine the relationship between variables of interest. In an observational study, a researcher records information about the participants without changing anything or manipulating the natural environment in which they exist. The most important feature of a cross-sectional study is that it can compare different samples at one given point in time.

For example, if a researcher wants to understand the relationship between jogging and cholesterol levels, he/she might choose two age groups of daily joggers, one group aged above 20 but below 30 and the other above 30 but below 40, and compare these to cholesterol levels amongst non-joggers in the same age categories. The researcher at this point in time can create subsets for gender, but cannot consider past cholesterol levels, as this would be outside the given parameters for cross-sectional studies.

Cross-sectional studies allow the study of many variables at a given time. Researchers can look at age, gender, income, etc. in relation to jogging and cholesterol at little or no additional cost. However, there is one downside to cross-sectional studies: this type of study cannot establish a definitive cause-and-effect relationship (a cause-and-effect relationship is one where one action (cause) makes another event happen (effect); for example, without an alarm, you might oversleep). This is mainly because a cross-sectional study offers a snapshot of a single moment in time and does not consider what happens before or after. Therefore, in the example stated above, it is difficult to know whether the daily joggers had low cholesterol levels before taking up jogging or whether the activity helped them reduce cholesterol levels that were previously high.
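To make the snapshot idea concrete, the following is a minimal Python sketch of how such a cross-sectional comparison might be tabulated, assuming a hypothetical dataset; all column names and values are illustrative, not from an actual study.

```python
# A minimal sketch of a cross-sectional comparison: one row per
# participant, all measured at a single point in time.
import pandas as pd

data = pd.DataFrame({
    "age_group":   ["20-30", "20-30", "20-30", "30-40", "30-40", "30-40"],
    "jogger":      [True, False, True, False, True, False],
    "cholesterol": [172, 195, 168, 210, 181, 205],  # mg/dL, illustrative
})

# Compare mean cholesterol across jogger/non-jogger samples within each
# age group; there is no follow-up over time, only this one snapshot.
snapshot = data.groupby(["age_group", "jogger"])["cholesterol"].mean()
print(snapshot)
```

Note that every row is measured at the same moment, so nothing in the data lets us say whether jogging lowered cholesterol or whether low-cholesterol people simply took up jogging.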
Longitudinal Study

A longitudinal study, like the cross-sectional study, is also an observational study, in which data is gathered from the same sample repeatedly over an extended period of time. A longitudinal study can last from a few years to even decades, depending on what kind of information needs to be obtained. The benefit of conducting a longitudinal study is that researchers can make observations over time and detect any changes in the characteristics of their participants. One of the important aspects here is that a longitudinal study extends beyond a single frame in time. As a result, it can establish a proper sequence of events.

Continuing with the example, in a longitudinal study a researcher might wish to look at the changes in cholesterol levels in women above the age of 30 but below 40 years who have jogged regularly over the last 10 years. In a longitudinal setup, it would be possible to account for cholesterol levels at the start of the jogging regime; therefore, longitudinal studies are more likely to suggest a cause-and-effect relationship.

Overall, the research question should drive the design; however, sometimes as the research progresses, it helps determine which design is more appropriate. Cross-sectional studies can be done more quickly than longitudinal studies. That is why a researcher may start off with a cross-sectional study and, if needed, follow it up with a longitudinal study.

Differences between Cross-Sectional Study and Longitudinal Study

Cross-sectional and longitudinal studies are both types of observational study, where the participants are observed in their natural environment. There is no alteration or change to the environment in which the participants exist. Despite this marked similarity, there are distinctive differences between these two forms of study. Let us analyze the differences between a cross-sectional study and a longitudinal study.

• Cross-sectional studies are quick to conduct; longitudinal studies may vary from a few years to even decades.
• A cross-sectional study is conducted at a given point in time; a longitudinal study requires the researcher to revisit participants of the study at proper intervals.
• A cross-sectional study is conducted with different samples; a longitudinal study is conducted with the same sample over the years.
• Cross-sectional studies cannot pin down a cause-and-effect relationship; a longitudinal study can justify a cause-and-effect relationship.
• In a cross-sectional study, multiple variables can be studied at a single point in time; in a longitudinal study, only one variable is considered across the study.
• A cross-sectional study is comparatively cheaper; since the study goes on for years, a longitudinal study tends to get expensive.

Conclusion

It is true that study design greatly depends on the nature of the research questions. Whenever a researcher decides to collect data by deploying surveys to his/her participants, what matters most are the survey questions, which must be placed tactfully so as to gather meaningful insights. In other words, knowing what kind of information a study should be able to collect is the first step in determining how to carry out the rest of the study: what steps need to be included and what can be given a pass.

Experimental Design: Concept of Cause

The term experimental research has a range of definitions. In the strict sense, experimental research is what we call a true experiment.
This is an experiment where the researcher manipulates one variable and controls/randomizes the rest of the variables. It has a control group, the subjects have been randomly assigned between the groups, and the researcher only tests one effect at a time. It is also important to know what variable(s) you want to test and measure.

A very wide definition of experimental research, or a quasi-experiment, is research where the scientist actively influences something to observe the consequences. Most experiments tend to fall in between the strict and the wide definition.

Experimental research design is centrally concerned with constructing research that is high in causal (internal) validity. Randomized experimental designs provide the highest levels of causal validity. Quasi-experimental designs have a number of potential threats to their causal validity. Yet, new quasi-experimental designs adopted from fields outside of criminology offer levels of causal validity that rival experimental designs.

The design of research is fraught with complicated and crucial decisions. Researchers must decide which research questions to address, which theoretical perspective will guide the research, how to measure key constructs reliably and accurately, who or what to sample and observe, how many people/places/things need to be sampled in order to achieve adequate statistical power, and which data analytic techniques will be employed. These issues are germane to research of all types (exploratory, explanatory, descriptive, evaluation research). However, the term “research design” typically does not refer to the issues discussed above. The term “experimental research design” is centrally concerned with constructing research that is high in causal (or internal) validity. Causal validity concerns the accuracy of statements regarding cause-and-effect relationships. For example, does variable 1 cause variation in variable 2? Or does variable 2 cause variation in variable 1? Or does variable 3 cause variation in both variables 1 and 2? And what is the magnitude of the causal relationships among the variables? Thus, research design as used herein is a concern of explanatory and evaluation research, but generally does not apply to exploratory or descriptive research.

Criteria for Establishing Causal Inferences

The three classic criteria necessary to support a causal inference, according to the philosopher John Stuart Mill, are: (1) association (correlation), (2) temporal order, and (3) nonspuriousness.

The criterion of association requires that there is a systematic relationship between the cause and effect variables. This criterion is by far the easiest to determine. The second criterion of temporal order is a bit more complicated. The temporal order criterion requires that the cause, or more precisely variation in the cause variable, must occur before the observed variation in the effect variable. The third criterion of nonspuriousness is by far the most difficult to achieve. This criterion requires that the observed relationship between the cause and the effect variables must not be due to other omitted or unmeasured third variables. Using the relationship between delinquent peers and offending as an example, this criterion requires that this relationship cannot be due to homophily or any other potential explanation. Because there are usually many, many potentially relevant third variables, and many of these third variables are unobserved, the criterion of nonspuriousness can be quite difficult to achieve.
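A small simulation can make the nonspuriousness problem concrete. The sketch below uses purely hypothetical variables (not from any cited study): x and y are both driven by an unmeasured third variable z, so they are strongly associated, yet the association vanishes once z is held approximately constant.

```python
# A minimal sketch of a spurious relationship: z drives both x and y.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
z = rng.normal(size=n)          # unmeasured third variable
x = z + rng.normal(size=n)      # "cause" candidate, driven by z
y = z + rng.normal(size=n)      # "effect" candidate, also driven by z

# x and y are clearly associated...
print(np.corrcoef(x, y)[0, 1])  # roughly 0.5

# ...but the association disappears once z is held (approximately)
# constant, here by looking only at cases where z is near zero.
near_zero = np.abs(z) < 0.1
print(np.corrcoef(x[near_zero], y[near_zero])[0, 1])  # near 0
```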
Causal Relationship

Causality is the relationship between cause and effect. Simple connections between cause and effect are linear and unidirectional. Complex connections between cause and effect, when organizations are thought of as systems, involve circular causality, interdependent systems, and non-linearity. Nonlinearity is where one variable can have a more than proportional effect on another due to the very complex connections between cause and effect. With nonlinearity it may become unclear what cause and effect mean, the links between cause and effect may become distant in time and space, and the links between cause and effect may disappear for all practical purposes.

The philosophical concept of causality or causation refers to the set of all particular “causal” or “cause-and-effect” relations. Most generally, causation is a relationship that holds between events, properties, variables, or states of affairs. According to Sowa (2000), up until the twentieth century, three assumptions described by Max Born in 1949 were dominant in the definition of causality:

1. Causality postulates that there are laws by which the occurrence of an entity B of a certain class depends on the occurrence of an entity A of another class, where the word entity means any physical object, phenomenon, situation, or event. A is called the cause, B the effect.
2. Antecedence postulates that the cause must be prior to, or at least simultaneous with, the effect.
3. Contiguity postulates that cause and effect must be in spatial contact or connected by a chain of intermediate things in contact. (Born, 1949, as cited in Sowa, 2000)

Causality always implies at least some relationship of dependency between the cause and the effect. For example, deeming something a cause may imply that, all other things being equal, if the cause occurs the effect does as well, or at least that the probability of the effect occurring increases. However, according to Sowa (2000), “relativity and quantum mechanics have forced physicists to abandon these assumptions as exact statements of what happens at the most fundamental levels, but they remain valid at the level of human experience.”

Expressing causal relationships: In natural languages, causal relationships can be expressed by the following causative expressions:

1. A set of causative verbs [cause, make, create, do, effect, produce, occasion, perform, determine, influence; construct, compose, constitute; provoke, motivate, force, facilitate, induce, get, stimulate; begin, commence, initiate, institute, originate, start; prevent, keep, restrain, preclude, forbid, stop, cease];
2. A set of causative names [actor, agent, author, creator, designer, former, originator; antecedent, causality, causation, condition, fountain, occasion, origin, power, precedent, reason, source, spring; grounds, motive, need, impulse];
3. A set of effective names [consequence, creation, development, effect, end, event, fruit, impact, influence, issue, outcome, outgrowth, product, result, upshot].

Concept of Independent and Dependent Variable

A variable is something you’re trying to measure. It can be practically anything, such as objects, amounts of time, feelings, events, or ideas. If you’re studying how people feel about different television shows, the variables in that experiment are television shows and feelings. If you’re studying how different types of fertilizer affect how tall plants grow, the variables are type of fertilizer and plant height.
There are two key variables in every experiment: the independent variable and the dependent variable.

1. Independent Variable

The independent variable is the variable whose change isn’t affected by any other variable in the experiment. Either the scientist has to change the independent variable herself or it changes on its own; nothing else in the experiment affects or changes it. Two examples of common independent variables are age and time. There’s nothing you or anything else can do to speed up or slow down time or increase or decrease age. They’re independent of everything else.

2. Dependent Variable

The dependent variable is what is being studied and measured in the experiment. It’s what changes as a result of the changes to the independent variable. An example of a dependent variable is how tall you are at different ages. The dependent variable (height) depends on the independent variable (age).

An easy way to think of independent and dependent variables is: when you’re conducting an experiment, the independent variable is what you change, and the dependent variable is what changes because of that. You can also think of the independent variable as the cause and the dependent variable as the effect. It can be a lot easier to understand the differences between these two variables with examples, so let’s look at some sample experiments below.

Examples of Independent and Dependent Variables in Experiments

Below are overviews of three experiments, each with its independent and dependent variables identified.

Experiment 1: You want to figure out which brand of microwave popcorn pops the most kernels so you can get the most value for your money. You test different brands of popcorn to see which bag pops the most popcorn kernels.
• Independent Variable: Brand of popcorn bag (it’s the independent variable because you are actually deciding the popcorn bag brands)
• Dependent Variable: Number of kernels popped (this is the dependent variable because it’s what you measure for each popcorn brand)

Experiment 2: You want to see which type of fertilizer helps plants grow fastest, so you add a different brand of fertilizer to each plant and see how tall they grow.
• Independent Variable: Type of fertilizer given to the plant
• Dependent Variable: Plant height

Experiment 3: You’re interested in how rising sea temperatures impact algae life, so you design an experiment that measures the number of algae in a sample of water taken from a specific ocean site under varying temperatures.
• Independent Variable: Ocean temperature
• Dependent Variable: The number of algae in the sample

For each of the independent variables above, it’s clear that they can’t be changed by other variables in the experiment. You have to be the one to change the popcorn and fertilizer brands in Experiments 1 and 2, and the ocean temperature in Experiment 3 cannot be significantly changed by other factors. Changes to each of these independent variables cause the dependent variables to change in the experiments.

Concomitant Variable, Extraneous Variable, Treatment and Control Groups

CONCOMITANT VARIABLE

A concomitant variable, or covariate, is a variable which we observe during the course of our research or statistical analysis, but which we cannot control and which is not the focus of our analysis. Although concomitant variables are not given any central recognition, they may be confounding or interacting with the variables being studied.
Ignoring them can lead to skewed or biased data, and so they must often be corrected for in a final analysis.

Examples of Concomitant Variables

Let’s say you had a study which compares the salaries of male vs. female college graduates. The variables being studied are gender and salary, and the primary survey questions are related to these two main topics. But, since salaries increase the longer someone has been in the workplace, the concomitant variable ‘time out of college’ has the potential to skew our data if it is not accounted for. If this variable is observed, recorded and accounted for in the final results, your conclusions will be more valid. Typically this is done by noting the concomitant variable (here, time out of college) in the initial data gathering, and then running a regression to ‘equalize’ all of the data points to the same number of years out of college.

Similarly, in a study comparing the effects of soil composition on the growth of tomatoes over 20 different locations country-wide, the average temperatures and hours of sunlight available to each tomato patch would both be concomitant variables that would need to be included in a final analysis in order to get valid results.

EXTRANEOUS VARIABLE

An extraneous variable is something that the experimenter cannot control, which can have an effect on the overall outcome of the experiment. The main four extraneous variables are demand characteristics, experimenter effects, participant variables and situational variables.

(i) Demand Characteristics: Environmental clues that may tell the participant what is expected of them, such as the environmental setting or the researcher’s body language. This in turn can affect their behaviour.
(ii) Experimenter Effects: When the researchers themselves affect the outcome by giving subconscious clues about how to behave. This may involve unintentionally asking leading questions that inform the participant of the desired result.
(iii) Participant Variables: Something about the participant that is out of the researcher’s control. For example, whilst researchers may try to target individuals with a certain background for an experiment, existing variables such as their health, or prior knowledge, could affect the outcome. For example, a participant with prior knowledge of Milgram’s experiment would be an extraneous variable in a reimagining of that experiment.
(iv) Situational Variables: Whilst the researcher may do their best to control an experiment (for example, controlling the time of day), situational variables can still affect the results. For example, a field experiment conducted at the same time of day across a week may experience sporadic weather or unexpected noise pollution, changing the mood/actions of the participants.

TREATMENT

A treatment group is a group that receives a treatment in an experiment. The “group” is made up of test subjects (people, animals, plants, cells, etc.) and the “treatment” is the variable you are studying. For example, a human experimental group could receive a new medication, a different form of counseling, or some vitamin supplements. A plant treatment group could receive a new plant fertilizer, more sunlight, or distilled water. The group that does not receive the treatment is called the control group. In an experiment, the factor (also called an independent variable) is an explanatory variable manipulated by the experimenter. Each factor has two or more levels, i.e., different values of the factor. Combinations of factor levels are called treatments.
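As a small illustration of that last definition, the sketch below (with hypothetical factors and levels) enumerates the treatments of a two-factor experiment as the combinations of factor levels:

```python
# A minimal sketch: factor levels combine into treatments.
from itertools import product

factors = {
    "fertilizer": ["brand A", "brand B"],  # factor 1: two levels
    "watering":   ["daily", "weekly"],     # factor 2: two levels
}

# Each combination of one level per factor is a distinct treatment.
treatments = list(product(*factors.values()))
print(treatments)
# [('brand A', 'daily'), ('brand A', 'weekly'),
#  ('brand B', 'daily'), ('brand B', 'weekly')]
```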
Treatment Group Examples

Example 1: You are testing to see if a new plant fertilizer increases sunflower size. You put 20 plants of the same height and strain into a location where all the plants get the same amount of water and sunlight. One half of the plants (the control group) get the regular fertilizer. The other half of the plants (the experimental group) get the fertilizer you are testing.

Example 2: You are testing to see if a new drug works for asthma. You divide 100 volunteers into two groups of 50. One group of 50 gets the drug; they are the experimental group. The other 50 people get a sugar pill (a placebo); they are the control group.

CONTROL GROUP

A control group is the standard to which comparisons are made in an experiment. Many experiments are designed to include a control group and one or more experimental groups; in fact, some scholars reserve the term experiment for study designs that include a control group. Ideally, the control group and the experimental groups are identical in every way except that the experimental groups are subjected to treatments or interventions believed to have an effect on the outcome of interest, while the control group is not. Inclusion of a control group greatly strengthens researchers’ ability to draw conclusions from a study. Indeed, only in the presence of a control group can a researcher determine whether a treatment under investigation truly has a significant effect on an experimental group, and the possibility of making an erroneous conclusion is reduced.

A typical use of a control group is in an experiment in which the effect of a treatment is unknown, and comparisons between the control group and the experimental group are used to measure the effect of the treatment. For instance, in a pharmaceutical study to determine the effectiveness of a new drug on the treatment of migraines, the experimental group will be administered the new drug and the control group will be administered a placebo (a drug that is inert, or assumed to have no effect). Each group is then given the same questionnaire and asked to rate the effectiveness of the drug in relieving symptoms. If the new drug is effective, the experimental group is expected to have a significantly better response to it than the control group. Another possible design is to include several experimental groups, each of which is given a different dosage of the new drug, plus one control group. In this design, the analyst will compare results from each of the experimental groups to the control group. This type of experiment allows the researcher to determine not only if the drug is effective but also the effectiveness of different dosages. In the absence of a control group, the researcher’s ability to draw conclusions about the new drug is greatly weakened, due to the placebo effect and other threats to validity. Comparisons between the experimental groups with different dosages can be made without including a control group, but there is no way to know if any of the dosages of the new drug are more or less effective than the placebo.

It is important that every aspect of the experimental environment be as alike as possible for all subjects in the experiment. If conditions are different for the experimental and control groups, it is impossible to know whether differences between groups are actually due to the difference in treatments or to the difference in environment.
For example, in the new migraine drug study, it would be a poor study design to administer the questionnaire to the experimental group in a hospital setting while asking the control group to complete it at home. Such a study could lead to a misleading conclusion, because differences in responses between the experimental and control groups could have been due to the effect of the drug or could have been due to the conditions under which the data were collected. For instance, perhaps the experimental group received better instructions or was more motivated by being in the hospital setting to give accurate responses than the control group.

A control group study can be managed in two different ways. In a single-blind study, the researcher will know whether a particular subject is in the control group, but the subject will not know. In a double-blind study, neither the subject nor the researcher will know which treatment the subject is receiving. In many cases, a double-blind study is preferable to a single-blind study, since the researcher cannot inadvertently affect the results or their interpretation by treating a control subject differently from an experimental subject.

Unit 3 Scaling and Measurement Technique

Scaling and Measurement Techniques: Needs of Measurement

Measurement can be defined as a process of associating numbers with observations obtained in a research study. The variables associated with a study are classified into two basic categories:
1. Quantitative/Numeric
2. Qualitative/Categorical

Incidentally, only quantitative variables can be measured with the help of standard counting devices; qualitative variables can only be observed, as there is no standard device or instrument to measure them. For example, in the case of human beings, there are certain quantitative (physical) characteristics like height, weight, etc., and there are certain qualitative (abstract) characteristics like beauty, attitude, creativity, etc. Like human beings, a business organization also has some physical characteristics like employees, sales, offices, etc. Being physical in nature, these are easily measurable. However, there are certain abstract characteristics like the reputation of the employees, the image of the entity, motivation, work culture, commitment, trust, and customers’ perceptions and feelings. All these are extremely important because they help the company to stay afloat and grow. Therefore, these characteristics have to be measured for their meaningful assessment. This can be done by assigning some numbers and forming scales.

Classification or Types of Measurement Scales

All measurement scales can be classified into the following four categories:
(i) Nominal
(ii) Ordinal
(iii) Interval
(iv) Ratio

Properties of Scales:
• Distinctive classification
• Order
• Equal distance
• Fixed origin

Measurement is the process of observing and recording the observations that are collected as part of a research effort. There are two major issues that will be considered here. First, you have to understand the fundamental ideas involved in measuring. Here we consider two major measurement concepts. In Levels of Measurement, the meaning of the four major levels of measurement is explained: nominal, ordinal, interval and ratio. Then we move on to the reliability of measurement, including consideration of true score theory and a variety of reliability estimators. Second, you have to understand the different types of measures that you might use in social research. We consider four broad categories of measurements.
Survey research includes the design and implementation of interviews and questionnaires. Scaling involves consideration of the major methods of developing and implementing a scale. Qualitative research provides an overview of the broad range of non-numerical measurement approaches. And unobtrusive measures present a variety of measurement methods that don’t intrude on or interfere with the context of the research.

Problems in Measurement in Management Research: Validity and Reliability

Measurement should be precise and unambiguous in an ideal research study. This objective, however, is often not met in its entirety. As such, the researcher must be aware of the sources of error in measurement. The following are the possible sources of error in measurement.

• Respondent: At times the respondent may be reluctant to express strong negative feelings, or it is just possible that he may have very little knowledge but may not admit his ignorance. All this reluctance is likely to result in an interview of ‘guesses.’ Transient factors like fatigue, boredom, anxiety, etc. may limit the ability of the respondent to respond accurately and fully.
• Situation: Situational factors may also come in the way of correct measurement. Any condition which places a strain on the interview can have serious effects on the interviewer-respondent rapport. For instance, if someone else is present, he can distort responses by joining in or merely by being present. If the respondent feels that anonymity is not assured, he may be reluctant to express certain feelings.
• Measurer: The interviewer can distort responses by rewording or reordering questions. His behaviour, style and looks may encourage or discourage certain replies from respondents. Careless mechanical processing may distort the findings. Errors may also creep in because of incorrect coding, faulty tabulation and/or statistical calculations, particularly in the data-analysis stage.
• Instrument: Error may arise because of a defective measuring instrument. The use of complex words beyond the comprehension of the respondent, ambiguous meanings, poor printing, inadequate space for replies, response choice omissions, etc. are a few things that make the measuring instrument defective and may result in measurement errors. Another type of instrument deficiency is poor sampling of the universe of items of concern.

The researcher must know that correct measurement depends on successfully dealing with all of the problems listed above. He must, to the extent possible, try to eliminate, neutralize or otherwise deal with all the possible sources of error so that the final results may not be contaminated.

RELIABILITY

A test must also be reliable. Reliability is the “self-correlation of the test.” It shows the extent to which the results obtained are consistent when the test is administered once or more than once on the same sample with a reasonable gap. Consistency in results obtained in a single administration is the index of internal consistency of the test; consistency in results obtained upon testing and retesting is the index of temporal consistency. Reliability thus includes both internal consistency as well as temporal consistency. A test to be called sound must be reliable, because reliability indicates the extent to which the scores obtained in the test are free from such internal defects of standardization as are likely to produce errors of measurement.
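Internal consistency is commonly estimated with Cronbach’s alpha, a standard estimator that is not named in the text above, so treat its use here as an assumption of this sketch. A minimal version, assuming hypothetical item scores where each row is a respondent and each column is a test item:

```python
# A minimal sketch of Cronbach's alpha for internal consistency.
import numpy as np

scores = np.array([   # rows: respondents, columns: items (hypothetical)
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [3, 2, 3, 3],
    [4, 4, 5, 4],
])

k = scores.shape[1]                          # number of items
item_vars = scores.var(axis=0, ddof=1)       # variance of each item
total_var = scores.sum(axis=1).var(ddof=1)   # variance of the total score

# alpha = k/(k-1) * (1 - sum(item variances) / variance of total score)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(round(alpha, 3))
```

Values of alpha near 1 indicate that the items hang together; low values suggest the items are not measuring the same underlying trait.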
Types of Reliability:
(i) Internal reliability
(ii) External reliability

• Internal Reliability: Internal reliability assesses the consistency of results across items within a test.
• External Reliability: External reliability refers to the extent to which a measure varies from one use to another.

Errors in Reliability: At times, scores are not consistent because some other factors also affect reliability, e.g. noise, health and timing. There is always a chance of a 5% error in reliability, which is acceptable.

VALIDITY

Validity is another prerequisite for a test to be sound. Validity indicates the extent to which the test measures what it intends to measure, when compared with some outside independent criterion. In other words, it is the correlation of the test with some outside criterion. The criterion should be an independent one and should be regarded as the best index of the trait or ability being measured by the test. Generally, the validity of a test is dependent upon reliability, because a test which yields inconsistent results (poor reliability) is ordinarily not expected to correlate with some outside independent criterion.

TYPES OF ERRORS
(i) Random error
(ii) Systematic error

(i) Random error: Random error exists in every measurement and is often a major source of uncertainty. These errors have no particular assignable cause and can never be totally eliminated or corrected. They are caused by the many uncontrollable variables that are an inevitable part of every analysis made by a human being. These variables are often impossible to identify, and even if we identify some, they cannot be measured because most of them are so small.

(ii) Systematic error: Systematic error is caused by instruments, machines, and measuring tools; it is not due to individuals. Systematic error is acceptable in the sense that we can identify, fix and handle it.

WAYS OF FINDING RELIABILITY

Following are the methods to check reliability:
• Test-retest
• Alternate form
• Split-half method

TEST-RETEST METHOD

This is the oldest and most commonly used method of testing reliability. The test-retest method assesses the external consistency of a test. Examples of appropriate tests include questionnaires and psychometric tests. It measures the stability of a test over time. A typical assessment would involve giving participants the same test on two separate occasions; everything from start to end should be the same in both administrations. The results of the first test are then correlated with the results of the second test. If the same or similar results are obtained, external reliability is established. The timing of the test is important: if the interval is too brief, participants may recall information from the first test, which could bias the results. Alternatively, if the interval is too long, it is feasible that the participants could have changed in some important way, which could also bias the results. The utility and worth of a psychological test decrease with time, so the test should be revised and updated; when tests are not revised, systematic error may arise.

ALTERNATE FORM

In the alternate form method, two equivalent forms of the test are administered to the same group of examinees. An individual is given one form of the test, and after a period of time the person is given a different version of the same test. The two forms of the test are then correlated to yield a coefficient of equivalence.

Positive point: There is no need to wait a long time between administrations.
Negative point: It is a very hectic and risky task to make two tests of an equivalent level.
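Both the test-retest method and the alternate form method ultimately reduce to correlating two sets of scores from the same examinees. A minimal sketch, assuming hypothetical scores for five participants:

```python
# A minimal sketch of a test-retest (or alternate-form) reliability check.
import numpy as np

first_administration  = np.array([12, 18, 25, 30, 22])  # hypothetical
second_administration = np.array([14, 17, 27, 29, 21])  # hypothetical

# The reliability coefficient is the correlation between the two
# administrations; values near 1 indicate stability over time (or
# equivalence of the two forms).
r = np.corrcoef(first_administration, second_administration)[0, 1]
print(round(r, 3))
```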
SPLIT-HALF METHOD

The split-half method assesses the internal consistency of a test. It measures the extent to which all parts of the test contribute equally to what is being measured. The test is typically split into odd- and even-numbered items. The reason is that when we make a test, we usually arrange the items in order of increasing difficulty; if we put items 1-10 in one half and items 11-20 in the other half, all the easy items will go to one group and all the difficult items to the second group. When we split the test, we should split it by the same format/theme, e.g. multiple-choice questions with multiple-choice questions, or blanks with blanks.
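A minimal sketch of the split-half computation, assuming hypothetical item scores (rows are respondents, columns are items in increasing difficulty). The two halves are correlated and then stepped up with the Spearman-Brown formula, the standard correction (not mentioned above) for estimating full-length reliability from a half-test correlation:

```python
# A minimal sketch of the split-half method with Spearman-Brown step-up.
import numpy as np

items = np.array([    # 1 = correct, 0 = incorrect (hypothetical)
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
])

# Split into odd- and even-numbered items so both halves mix easy and
# difficult questions, then total each half per respondent.
odd_half  = items[:, 0::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)

r_half = np.corrcoef(odd_half, even_half)[0, 1]

# Spearman-Brown: estimate full-length reliability from the half-test
# correlation.
r_full = 2 * r_half / (1 + r_half)
print(round(r_half, 3), round(r_full, 3))
```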
Attitude Scaling Techniques: Concept of Scale, Rating Scales viz. Likert Scales

Attitudes are individual mental processes which determine both the actual and potential response of each person in a social world.
An attitude is always directed toward some object, and therefore an attitude is the state of mind of the individual toward a value.

Techniques:

(a) Paired Comparison: This method requires the respondent to choose the one of a pair of stimuli that has more of some designated property than the other. Example: the Hero Honda motorcycle dominates all other motorcycles.

(b) Ordered Category Sorting: This requires the respondent to sort stimuli into ordered categories with respect to some designated property of interest. Example: sorting a set of a dozen car tyres into categories of high grip, moderate grip and low grip.

(c) Ranking Method: This determines the perceived order of, say, six brands of tyres with respect to grip on application of the brakes. Each respondent is asked to rank the tyre brands with respect to grip.

(d) Rating Techniques: Rating different brands of motorcycles in terms of reliability and fuel efficiency (i.e., km/litre of petrol), etc.

RATING SCALE

A rating scale is defined as a closed-ended survey question used to represent respondent feedback in a comparative form for specific features/products/services. It is one of the most established question types for online and offline surveys, where survey respondents are expected to rate an attribute or feature. The rating scale is a variant of the popular multiple-choice question, widely used to gather information that provides relative information about a specific topic. Researchers use a rating scale in research when they intend to associate a qualitative measure with the various aspects of a product or feature. Generally, this scale is used to evaluate the performance of a product or service, employee skills, customer service performance, processes followed for a particular goal, etc. A rating scale survey question can be compared to a checkbox question, but a rating scale provides more information than a mere Yes/No.

Types of Rating Scale

Broadly speaking, rating scales can be divided into two categories: ordinal and interval scales. An ordinal scale is a scale that depicts the answer options in an ordered manner. The difference between two answer options may not be calculable, but the answer options will always be in a certain innate order. Parameters such as attitude or feedback can be presented using an ordinal scale. An interval scale is a scale where not only is the order of the answer variables established, but the magnitude of difference between each answer variable is also calculable. An absolute or true zero value is not present in an interval scale. Temperature in Celsius or Fahrenheit is the most popular example of an interval scale. Net Promoter Score, the Likert Scale and the Bipolar Matrix Table are some of the most effective types of interval scale.

There are four primary types of rating scales which can be suitably used in an online survey:
• Graphic Rating Scale
• Numerical Rating Scale
• Descriptive Rating Scale
• Comparative Rating Scale

LIKERT SCALE

A Likert scale is a scale used to measure attitudes, wherein the respondents are asked to indicate their level of agreement or disagreement with statements related to the stimulus objects. The Likert scale was named after its developer, Rensis Likert. It is typically a five-response-category scale ranging from “strongly disagree” to “strongly agree”. The purpose of a Likert scale is to identify the attitude of people towards the given stimulus objects by asking them the extent to which they agree or disagree with them.
LIKERT SCALE
A Likert scale is a scale used to measure attitudes, wherein the respondents are asked to indicate their level of agreement or disagreement with statements related to the stimulus objects. The Likert scale was named after its developer, Rensis Likert. It is typically a five-category response scale ranging from "strongly disagree" to "strongly agree". The purpose of a Likert scale is to identify the attitude of people towards the given stimulus objects by asking the extent to which they agree or disagree with them. Often, the respondents are presented with questionnaires containing a set of statements to rate their attitude towards the objects. For example, respondents might be asked to rate their purchase experience with Shoppers Stop by assigning a score (1 = strongly disagree, 2 = disagree, 3 = neither agree nor disagree, 4 = agree, 5 = strongly agree) to the series of statements given below:
• Shoppers Stop sells high-quality merchandise.
• I like to shop at Shoppers Stop.
• It offers several credit schemes.
• It charges fair prices.
• I like the way Shoppers Stop advertises its products.
The data obtained from a Likert scale are typically treated as interval data. Thus, we can say that the Likert scale possesses the description, order, and distance characteristics: description means the unique labels or tags designated to each value of the scale; order means the relative position of the descriptors; and distance implies that the absolute differences between the descriptors are known and can be expressed in units. For the purpose of analysis, each statement is allotted a numerical score ranging from either 1 to 5 or -2 to +2. The analysis can be done item-wise, or a total score can be computed by summing the items for each respondent (a scoring sketch follows). One of the advantages of a Likert scale is that it is easy to construct and administer. The major limitation of this scaling technique is that it is time-consuming compared with other itemized scaling techniques, because each respondent is required to read every statement in the questionnaire before assigning it a numerical value. Another limitation is that responses can be misunderstood at times, especially when they are unfavorable.
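A minimal sketch of the item-wise and summated scoring just described, using invented responses to the five statements above (in practice, any negatively worded statements would be reverse-coded before summing):

```python
# Hypothetical responses to the five Shoppers Stop statements, coded
# 1 = strongly disagree ... 5 = strongly agree, one row per respondent.
responses = {
    "respondent_1": [4, 5, 3, 4, 4],
    "respondent_2": [2, 3, 4, 2, 3],
    "respondent_3": [5, 5, 4, 5, 4],
}

# Summated scoring: one total (and per-item mean) for each respondent.
for respondent, scores in responses.items():
    total = sum(scores)         # summated Likert score: 5 to 25 here
    mean = total / len(scores)  # per-item average, stays on the 1-5 scale
    print(f"{respondent}: total={total}, mean={mean:.1f}")

# Item-wise analysis: average agreement with each individual statement.
num_items = 5
for item in range(num_items):
    item_mean = sum(r[item] for r in responses.values()) / len(responses)
    print(f"statement {item + 1}: mean rating {item_mean:.1f}")
```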
Semantic Differential Scales
A semantic differential scale is a survey or questionnaire rating scale that asks people to rate a product, company, brand, or any "entity" within the frame of multi-point rating options. The answer options are anchored by grammatically opposite adjectives at each end, for example love/hate, satisfied/unsatisfied, and likely to return/unlikely to return, with intermediate options in between. Surveys or questionnaires using a semantic differential scale are among the most reliable ways to get information on people's emotional attitude towards a topic of interest. Charles Egerton Osgood, a famous American psychologist, invented the semantic differential scale so that this "connotative meaning" of emotional attitude towards entities could be recorded and put to good use. This research was conducted on a large database, and Osgood found that three scales were commonly effective, irrespective of race, culture, or difference in language:
• Evaluation: combinations similar to "good-bad"
• Potency: pairs on the lines of "powerful-weak"
• Activity: combinations like "active-passive"
A wide variety of subjects can be measured using these combinations, such as customers' outlook on an upcoming product launch or employee satisfaction.

Where to use a Semantic Differential Scale?
Its ease of understanding and the popularity that comes with it make it extremely reliable, and the versatility of these survey questions makes the data collected very accurate. Semantic differential scale questions ask respondents to rate your products, organization, or services on multi-point questions with polar adjectives at the extremes of the scale, such as likely/unlikely, happy/sad, loved the service/hated the service.

Advantages of Semantic Differential Scale
1. The semantic differential scale outdoes other scales, such as the Likert scale, in terms of vitality, rationality, and authenticity.
2. It has an advantage in terms of language too: there are two polar adjectives for the factor to be measured, and a scale connecting both these poles.
3. This is more advantageous than a Likert scale, where a statement is declared and the respondents are expected to either agree or disagree with it.
4. Respondents can express their opinions about the matter at hand more specifically and fully because of the polar options provided in semantic differential scale questions.
5. In other question types such as the Likert scale, respondents have to indicate their level of agreement or disagreement with the stated topic, whereas the semantic differential scale offers extremely opposite adjectives at each end of the scale, so respondents can express their feedback precisely, supporting more accurate judgments from the survey.

Constant Sum Scale
A constant sum scale is a type of question used in a market research survey in which respondents are required to divide a specific number of points or percentages as part of a total sum. The allocation of points details the variance and weight of each category. Constant sum scales are used less frequently in surveys than basic Likert scales, single radio responses, or checklists (i.e. multiple response options). They are an excellent way to create variance in a data set and truly understand which factors are key, and which are not, for customers or respondents. They are especially helpful if you need to ask a question where you believe several factors are critical or of high importance; you are more likely to create differentiation in the data with a constant sum question than with other question types.

Example of a Constant Sum Scale Option
Finally, here is an example of a constant sum scale question as it might be asked in a survey. It forces respondents to slow down a bit and think about how important each factor is as they allot points.
Q: Using 100 points, please apply a number of points to each factor based on how important each is to you when buying a home. The points must total 100, divided among the factors.
A: Price, Location, School District, Inside Features, etc.
The respondent is given 100 points. They may choose to apply 80 to price, 15 to location, and spread the remaining 5 points among the other factors. When you analyze this data set, the differentiation between factors becomes evident. Most survey software will automatically tally the point values to ensure they add to a constant sum of 100.

Graphic Rating Scale
The graphic rating scale is a type of performance appraisal method. In this method, traits or behaviours that are important for effective performance are listed out, and each employee is rated against these traits. The rating helps employers quantify the behaviours displayed by their employees. Some of these behaviours might be:
• Quality of work
• Teamwork
• Sense of responsibility
• Ethics, etc.
Characteristics of a good graphic rating scale are:
• Performance evaluation measures against which an employee is to be rated must be well defined.
• Scales should be behaviourally based.
• Ambiguous behaviour definitions, such as loyalty, honesty, etc., should be avoided.
• Ratings should be relevant to the behaviour being measured. For example, to measure "English speaking skill", the ratings should be fluent, hesitant, and laboured rather than excellent, average, and poor.

Example of a Graphic Rating Scale question
How would you rate the individual in terms of quality of work, neatness and accuracy?
(i) Non-existent: Careless worker. Tends to repeat similar mistakes.
(ii) Average: Work is sometimes unsatisfactory due to untidiness.
(iii) Good: Work is acceptable. Not many errors.
(iv) Very good: Reliable worker. Good quality of work. Checks work and is observant.
(v) Excellent: Work is of high quality. Errors are rare, if any. Little wasted effort.
Advantages:
• The method is easy to understand and user friendly.
• The comparison criteria are standardized.
• Behaviours are quantified, making the appraisal system easier.
Disadvantages:
• Judgmental error: Ratings of behaviour may or may not be accurate, as the perception of behaviour varies with the judge.
• Difficulty in rating: Rating against labels like "excellent" and "poor" is difficult, and at times even tricky, as the scale does not exemplify the ideal behaviours required for achieving a given rating.
• Perception issues: Perception errors such as the halo effect, recency effect, and stereotyping can cause incorrect ratings.
• Graphic rating scales are good at identifying the best and the poorest employees, but they do not help in differentiating the average employees.
• They are not effective in understanding the strengths of employees: different employees have different strong characteristics, yet these may quantify to the same score.

Ranking Scale
A ranking scale is a survey question tool that measures people's preferences by asking them to rank their views on a list of related items. Using these scales can help your business establish what matters and what doesn't matter to either external or internal stakeholders. You could use ranking scale questions to evaluate customer satisfaction or to assess ways to motivate your employees, for example. Ranking scales can be a source of useful information, but they do have some disadvantages.

Businesses typically use ranking scales when they want to establish preferences or levels of importance in a group of items. A respondent completing a scale with five items, for example, will assign a number from 1 to 5 to each individual item. Typically, the number 1 goes to the item that is most important to the respondent, and the number 5 to the one that is of least importance. In some cases, scales do not force respondents to rank all items, asking them, for example, to choose only their top three out of the five. Online surveys may remove the need to key in numbers, allowing respondents to drag and drop items into order.

Advantages of Ranking Scales
Ranking scales give you an insight into what matters to your respondents. Each response to an item has an individual value, giving results that you can easily average and rank numerically. This can be a valuable business tool, as it gives a statistical breakdown of your audience's preferences based on what you need to know. If you are making business decisions and have various options to choose from, data from a ranking scale might give you a clearer insight into how to satisfy your audience based on what is important to them.

Disadvantages of Ranking Scales
Ranking scales cannot tell you why something is important or unimportant to respondents. They address items in relation to each other rather than individually, and they may not give fully accurate results.
Respondents cannot give the same rating to two items, even if those items are of equal importance to them. There is no way to measure how much distance there is between levels of importance for each rating, even though this distance may vary. Survey results may suffer from "order bias", where respondents rank the first set of items more positively than later ones. This can also be a problem if you ask respondents to rank too many items at once, because they may lose focus.

Unit 4 Sampling
Sampling: Basic Concept: Defining the Universe
Sampling is a process used in statistical analysis in which a predetermined number of observations is taken from a larger population. The methodology used to sample from a larger population depends on the type of analysis being performed, but may include simple random sampling or systematic sampling. In business, a CPA performing an audit uses sampling to determine the accuracy of account balances in the financial statements, and managers use sampling to assess the success of the firm's marketing efforts.

The sample should be a representation of the entire population. When taking a sample from a larger population, it is important to consider how the sample is chosen. To get a representative sample, the sample must be drawn randomly and encompass the whole population. For example, a lottery system could be used to determine the average age of students in a university by sampling 10% of the student body.

A good sample is one which satisfies all or most of the following conditions:
(i) Representativeness: When a sampling method is adopted by the researcher, the basic assumption is that the samples selected from the population are the best representatives of that population. Thus good samples are those which accurately represent the population; probability sampling techniques yield representative samples. In measurement terms, the sample must be valid, and the validity of a sample depends upon its accuracy.
(ii) Accuracy: Accuracy is defined as the degree to which bias is absent from the sample. An accurate (unbiased) sample is one which exactly represents the population; it is free from any influence that causes a difference between the sample value and the population value.
(iii) Size: A good sample must be adequate in size and reliable. The sample size should be such that the inferences drawn from the sample are accurate to a given level of confidence and represent the entire population under study.

The size of the sample depends on a number of factors. Some important ones are:
(i) Homogeneity or heterogeneity of the universe: The selection of the sample depends on the nature of the universe. If the universe is homogeneous, then even a small sample will represent the behaviour of the entire universe, leading to the selection of a small sample rather than a large one. On the other hand, if the universe is heterogeneous, then samples have to be chosen from each heterogeneous unit.
(ii) Number of classes proposed: If a large number of class intervals is to be made, then the sample size should be larger, because it has to represent the entire universe; with small samples there is the possibility that some classes may not be represented.
(iii) Nature of study: The size of the sample also depends on the nature of the study. For an intensive study which may continue for a long time, large samples are to be chosen.
Similarly, for general studies a large number of respondents may be appropriate, but if the study is technical in nature, then selecting a large number of respondents may cause difficulty while gathering information.

Sampling is the act, process, or technique of selecting a representative part of a population for the purpose of determining the characteristics of the whole population. In other words, the process of selecting a sample from a population using special sampling techniques is called sampling. It should be ensured in the sampling process itself that the sample selected is representative of the population.

Examples of Sample Tests for Marketing
Businesses aim to sell their products and/or services to target markets. Before presenting products to the market, companies generally identify the needs and wants of their target audience. To do so, they may use a sample of the population to gain a better understanding of those needs and later create a product and/or service that meets them. Gathering the opinions of the sample helps to identify the needs of the whole.

UNIVERSE OR POPULATION
The population or universe represents the entire group of units which is the focus of the study. Thus, the population could consist of all the persons in the country, or those in a particular geographical location, or a special ethnic or economic group, depending on the purpose and coverage of the study. A population could also consist of non-human units such as farms, houses, or business establishments. The entire aggregation of items from which samples can be drawn is known as a population; in sampling, the population may refer to the units from which the sample is drawn. The terms "population" and "population of interest" are interchangeable. The term "unit" is used because, in a business research process, samples are not necessarily people all the time; a population of interest may be a universe of nations or cities. This is one of the first things the analyst needs to define properly while conducting business research. Therefore population, contrary to its general notion as a nation's entire citizenry, has a much broader meaning in sampling. "N" represents the size of the population.

Concept of Statistical Population
In statistics, a population is a set of similar items or events which is of interest for some question or experiment. A statistical population can be a group of existing objects (e.g. the set of all stars within the Milky Way galaxy) or a hypothetical and potentially infinite group of objects conceived as a generalization from experience (e.g. the set of all possible hands in a game of poker). A common aim of statistical analysis is to produce information about some chosen population. In statistical inference, a subset of the population (a statistical sample) is chosen to represent the population in a statistical analysis. The ratio of the size of this statistical sample to the size of the population is called the sampling fraction. It is then possible to estimate the population parameters using the appropriate sample statistics. In statistics, the term population is used to describe the subjects of a particular study: everything or everyone who is the subject of a statistical observation. Populations can be large or small in size and defined by any number of characteristics, though these groups are typically defined specifically rather than vaguely; for instance, a population of women over 18 who buy coffee at Starbucks, rather than a population of women over 18.
Statistical populations are used to observe behaviors, trends, and patterns in the way the individuals in a defined group interact with the world around them, allowing statisticians to draw conclusions about the characteristics of the subjects of study. These subjects are most often humans, animals, or plants, but they can even be objects such as stars.

Overview: Statistical Population
Function: Statistical analysis
Definition: A set of observations that share a property or set of properties
Example: Coffee drinkers in France
Value: Targeting a set of data for the purposes of analysis
Related techniques: Statistical model; probability distribution

Sample, Characteristics of a Good Sample
Sample
A sample is a smaller, manageable version of a larger group. It is a subset containing the characteristics of a larger population. Samples are used in statistical testing when population sizes are too large for the test to include all possible members or observations. A sample should represent the whole population and not reflect bias toward a specific attribute.

In basic terms, a population is the total number of individuals, animals, items, observations, data points, etc. of any given subject. For example, as of 2017, the population of the world was 7.5 billion, of which 49.6% were female and 50.4% were male. The total number of people in any given country can also be a population size. The total number of students in a city can be taken as a population, and the total number of dogs in a city is also a population size. Scientists, researchers, marketers, academicians, and any related or interested party trying to draw data from a group will find that a population size may be too large to monitor. Consider a team of academic researchers who want to know, say, how many students studied for less than 40 hours for the CFA exam in 2016 and still passed. Since more than 200,000 people globally take the exam each year, reaching out to each and every exam participant would be extremely tedious and time consuming. In fact, by the time the data from the population had been collected and analyzed, a couple of years would have passed, making the analysis worthless, since a new population would have emerged.

Characteristics of a Good Sample
(1) Goal-oriented: A sample design should be goal-oriented. It is a means to an end and should be oriented to the research objectives and fitted to the survey conditions.
(2) Accurate representative of the universe: A sample should be an accurate representative of the universe from which it is taken. There are different methods for selecting a sample, and it will be truly representative only when it represents all types of units or groups in the total population in fair proportions. In brief, a sample should be selected carefully, as improper sampling is a source of error in the survey.
(3) Proportional: A sample should be proportional. It should be large enough to represent the universe properly; the sample size should be sufficiently large to provide statistical stability or reliability and to give the accuracy required for the purpose of the particular study.
(4) Random selection: A sample should be selected at random. This means that every item in the group has a full and equal chance of being selected and included in the sample, which makes the selected sample truly representative in character.
(5) Economical: A sample should be economical. The objectives of the survey should be achieved with minimum cost and effort.
(6) Practical: A sample design should be practical.
The sample design should be simple, i.e. it should be capable of being understood and followed in fieldwork.
(7) Actual information provider: A sample should be designed so as to provide the actual information required for the study and also to provide an adequate basis for the measurement of its own reliability.
In brief, a good sample should be truly representative in character, selected at random, and adequately proportional. These, in fact, are the attributes of a good sample.

Sampling Frame (Practical Approach for Determining the Sample Frame Expected)
When developing a research study, one of the first things that you need to do is clarify all of the units (also referred to as cases) that you are interested in studying. Units could be people, organizations, or existing documents. In research, these units make up the population of interest. When defining the population, it's really important to be as specific as possible. The problem is that it's not always possible or feasible to study every unit in a population. For example, you might be interested in American college students' attitudes about owning houses. It would obviously be too time-consuming and costly to collect information from every college student in the United States. In cases like these, you can study a portion or subset of the population, called a sample. The process of selecting a sample needs to be deliberate, and there are various sampling techniques that you can use depending upon the purpose of the research. Prior to selecting a sample you need to define a sampling frame, which is a list of all the units of the population of interest. You can only apply your research findings to the population defined by the sampling frame.

Qualities of a Good Sampling Frame
You can't just use any list you come across! Care must be taken to make sure your sampling frame is adequate for your needs. It should:
• Include all individuals in the target population.
• Exclude all individuals not in the target population.
• Include accurate information that can be used to contact selected individuals.
Other general features you would want to make sure you have:
• A unique identifier for each member. This could be a simple numerical identifier (e.g. from 1 to 1000). Check to make sure there are no duplicates in the frame.
• A logical organization to the list: for example, put the members in alphabetical order.
• Up-to-date information. This may need to be checked periodically (e.g. for address changes).
In some cases, it might be impossible, or very difficult, to get a sampling frame. For example, you aren't likely to get a list of prostitutes in your city, mostly because most won't want to be found. Sometimes techniques like snowball sampling must be used to make up for the lack of a sampling frame. Snowball sampling is where you find one person (or a few people) for your survey or experiment. You then ask them to find someone else who would be willing to participate. Then that person finds someone else, and so on, until you have enough people for your needs.

Sampling Errors, Non-Sampling Errors, Methods to Reduce the Error
A sampling error is a statistical error that occurs when an analyst does not select a sample that represents the entire population of data, so that the results found in the sample do not represent the results that would be obtained from the entire population.
Sampling is an analysis performed by selecting a number of observations from a larger population, and the selection can produce both sampling errors and non-sampling errors. Sampling error can be reduced by increasing the sample size and by ensuring that the sample adequately represents the entire population. Assume, for example, that XYZ Company provides a subscription-based service that allows consumers to pay a monthly fee to stream videos and other programming over the web. The firm wants to survey homeowners who watch at least 10 hours of programming over the web each week and pay for an existing video streaming service. XYZ wants to determine what percentage of the population is interested in a lower-priced subscription service. If XYZ does not think carefully about the sampling process, several types of sampling errors may occur.

Examples of Sampling Error
A population specification error means that XYZ does not understand the specific types of consumers who should be included in the sample. If, for example, XYZ creates a population of people between the ages of 15 and 25 years old, many of those consumers do not make the purchasing decision about a video streaming service, because they do not work full-time. On the other hand, if XYZ puts together a sample of working adults who make purchase decisions, the consumers in this group may not watch 10 hours of video programming each week.
Selection error also causes distortions in the results of a sample; a common example is a survey that relies only on the small portion of people who respond immediately. If XYZ makes an effort to follow up with consumers who don't initially respond, the results of the survey may change. Furthermore, if XYZ excludes consumers who don't respond right away, the sample results may not reflect the preferences of the entire population.

Sample Size and Sampling Error
Given two otherwise identical studies, with the same sampling methods and the same population, the study with the larger sample size will have less sampling error than the study with the smaller sample size. Keep in mind that as the sample size increases, it approaches the size of the entire population and therefore also approaches all the characteristics of the population, thus decreasing sampling error. A small simulation of this effect follows.
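To illustrate this claim, here is a minimal simulation sketch using an invented population and only the standard library: the average distance between the sample mean and the population mean shrinks as the sample size grows.

```python
import random
import statistics

random.seed(42)  # fixed seed only so the example is repeatable

# An invented population of 100,000 values (mean about 50, sd about 10).
population = [random.gauss(50, 10) for _ in range(100_000)]
population_mean = statistics.mean(population)

for n in (10, 100, 1_000):
    # Draw many samples of size n and see how far their means stray.
    errors = []
    for _ in range(500):
        sample = random.sample(population, n)
        errors.append(abs(statistics.mean(sample) - population_mean))
    avg_error = statistics.mean(errors)
    print(f"n={n:>5}: average sampling error of the mean = {avg_error:.2f}")
```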
Non-Sampling Errors
A non-sampling error is an error that occurs during data collection, causing the data to differ from the true values. Non-sampling error differs from sampling error. A sampling error is limited to any differences between sample values and universe values that arise because the entire universe was not sampled. Sampling error can result even when no mistakes of any kind are made; the "errors" result from the mere fact that data in a sample are unlikely to perfectly match data in the universe from which the sample is taken. This error can be minimized by increasing the sample size. Non-sampling errors cover all other discrepancies, including those that arise from a poor sampling technique. Non-sampling errors may be present in both samples and censuses (in which an entire population is surveyed) and may be random or systematic. Random errors are believed to offset each other and are therefore of little concern. Systematic errors, on the other hand, affect the entire sample and therefore present a greater issue.
Non-sampling errors include, but are not limited to, data entry errors, biased survey questions, biased processing and decision-making, non-response, inappropriate analysis conclusions, and false information provided by respondents. While increasing the sample size will help minimize sampling error, it will not have any effect on reducing non-sampling error. Unfortunately, non-sampling errors are often difficult to detect, and it is virtually impossible to eliminate them entirely.

Methods to Reduce Sampling Error
Of the two types of errors, sampling error is easier to identify. The main techniques for reducing sampling error are:
(i) Increase the sample size. A larger sample size leads to a more precise result, because the study gets closer to the actual population size.
(ii) Divide the population into groups. Instead of a purely random sample, test groups according to their size in the population. For example, if people of a certain demographic make up 35% of the population, make sure 35% of the study is made up of this group.
(iii) Know your population. A population specification error occurs when a research team selects an inappropriate population from which to obtain data. Know who buys your product, uses it, works with you, and so forth. With basic socio-economic information, it is possible to reach a consistent sample of the population. In marketing research, studies often relate to one specific population, such as Facebook users, Baby Boomers, or homeowners.

Methods to Reduce Non-Sampling Error
(i) Thoroughly pretest your survey mediums
It is very important to ensure that your survey and its invitations run smoothly through any medium or on any device your potential respondents might use. People are much more likely to ignore survey requests if loading times are long, questions do not fit properly on their screens, or they have to work to make the survey compatible with their device. The best advice is to acknowledge your sample's different forms of communication software and devices, and to pretest your surveys and invitations on each, ensuring your survey runs smoothly for all your respondents.
(ii) Avoid rushed or short data collection periods
One of the worst things a researcher can do is limit the data collection time in order to comply with a strict deadline. Your study's level of non-response bias will climb dramatically if you are not flexible with the time frames respondents have to answer your survey. Fortunately, flexibility is one of the main advantages of online surveys, since they do not require interviews (phone or in person) that must be completed at certain times of the day. However, keeping your survey live for only a few days can still severely limit a potential respondent's ability to answer. Instead, it is recommended to extend the survey collection period to at least two weeks, so that participants can choose any day of the week to respond according to their own busy schedules.
(iii) Send reminders to potential respondents
Sending a few reminder emails throughout your data collection period has been shown to effectively gather more completed responses. It is best to send your first reminder email midway through the collection period and the second near the end of the collection period. Make sure you do not harass the people on your email list who have already completed your survey! You can manage your reminders and invitations on FluidSurveys through the trigger options found in the invite tool.
(iv) Ensure confidentiality
Any survey that requires information of a personal nature should include reassurance to respondents that the data collected will be kept completely confidential. This is especially the case in surveys focused on sensitive issues. Make certain that someone reading your invitation understands that the information they provide will be viewed as part of the whole sample and not individually scrutinized.
(v) Use incentives
Many people refuse to respond to surveys because they feel they do not have the time to spend answering questions. An incentive is usually necessary to motivate people to take part in your study. Depending on the length of the survey, the difficulty in finding the correct respondents (e.g. one-legged, 15th-century spoon collectors), and the information being asked, the incentive can range from minimal to substantial in value. Remember, most respondents won't have a vested interest in your study and must feel that the survey is worth their time!

Sample Size Constraints, Non-Response
Effects of Small Sample Size
In the sample-size formula (given at the end of this unit), the sample size is directly proportional to the Z-score and inversely proportional to the margin of error. Consequently, reducing the sample size reduces the confidence level of the study, which is related to the Z-score, and it also increases the margin of error. In short, when researchers are constrained to a small sample size for economic or logistical reasons, they may have to settle for less conclusive results. Whether or not this is an important issue depends ultimately on the size of the effect they are studying. For example, a small sample size would give more meaningful results in a poll of people living near an airport who are affected negatively by air traffic than it would in a poll of their education levels.

Effect of Large Sample Size
There is a widespread belief that large samples are ideal for research or statistical analysis. However, this is not always true. Very large samples that exceed the value estimated by sample size calculation present different hurdles. The first is ethical: should a study be performed with more patients than necessary? This means that more people than needed are exposed to the new therapy, which potentially implies increased inconvenience and risk. Obviously the problem is compounded if the new protocol is inferior to the traditional method: more patients are involved in a new, uncomfortable therapy that yields inferior results. The second obstacle is that the use of a larger number of cases can also involve more financial and human resources than necessary to obtain the desired response. In addition to these factors, there is another noteworthy issue that has to do with statistics. Statistical tests were developed to handle samples, not populations. When numerous cases are included in the statistics, analysis power is substantially increased. This implies an exaggerated tendency to reject null hypotheses with clinically negligible differences: what is insignificant becomes significant. Thus, a statistically significant difference of 0.1° in the ANB angle between two groups would obviously produce no clinical difference in the effects of wearing an appliance. When very large samples are available in a retrospective study, the researcher first needs to collect subsamples randomly, and only then perform the statistical test.
If it is a prospective study, the researcher should collect only what is necessary and include a few more individuals to compensate for subjects who leave the study.

CONCLUSIONS
In designing a study, sample size calculation is important for methodological and ethical reasons, as well as for reasons of human and financial resources. When reading an article, the reader should be alert to ascertain that the study was subjected to sample size calculation. In the absence of this calculation, the findings of the study should be interpreted with caution. An appropriate sample renders the research more efficient: the data generated are reliable, and resource investment is as limited as possible while conforming to ethical principles. The use of sample size calculation directly influences research findings. Very small samples undermine the internal and external validity of a study. Very large samples tend to transform small differences into statistically significant differences, even when they are clinically insignificant. As a result, both researchers and clinicians are misguided, which may lead to failures in treatment decisions.

NON-RESPONSE
A lot of things can go wrong in a survey. One of the most important problems is non-response: the phenomenon that the required information is not obtained from the persons selected in the sample.

The consequences of non-response
One effect of non-response is that it reduces the sample size. This in itself does not lead to wrong conclusions, but due to the smaller sample size the precision of estimators will be lower and the margins of error larger. A more serious effect of non-response is that it can be selective. This occurs if, due to non-response, specific groups are under- or over-represented in the survey. If these groups behave differently with respect to the survey variables, the estimators become biased; in other words, estimates are systematically too high or too low.

Example: surveys of Statistics Netherlands
Selective non-response is not uncommon, and it occurs in a number of surveys of Statistics Netherlands. A follow-up study of the Dutch Victimization Survey showed that persons who are afraid to be home alone at night are less inclined to participate in the survey. In the Dutch Housing Demand Survey, it turned out that people who refused to participate had lesser housing demands than people who responded. And in the Survey of Mobility of the Dutch Population, it was clear that the more mobile people were under-represented among the respondents.

Probability Sampling
Probability sampling is a sampling technique in which samples from a larger population are chosen using a method based on the theory of probability. For a participant to be considered a probability sample, he or she must be selected by random selection. The most important requirement of probability sampling is that everyone in your population has a known and equal chance of being selected. For example, if you have a population of 100 people, every person has odds of 1 in 100 of being selected. Probability sampling gives you the best chance to create a sample that is truly representative of the population. Probability sampling uses statistical theory to randomly select a small group of people (the sample) from an existing large population, and then predicts that all their responses together will match the overall population.

Probability Sampling Example
Let us take an example to understand this sampling technique.
The population of the US alone is about 330 million; it is practically impossible to send a survey to every individual to gather information. But you can use probability sampling to get data which is nearly as good, even though it is collected from a smaller population. For example, consider an organization that has 500,000 employees at different geographic locations. The organization wishes to make certain amendments to its human resource policy, but before rolling out the change it wants to know whether the employees will be happy with it. It is a tedious task to reach out to all 500,000 employees; this is where probability sampling comes in handy. A sample can be chosen from the larger population of 500,000 employees; this sample will represent the population, and a survey can then be deployed to it. From the responses received, management will be able to know whether the employees in the organization are happy or not about the amendment.

Steps involved in Probability Sampling
1. Choose your population of interest carefully: Think carefully about whose opinions should be collected, and include those people in the sample.
2. Determine a suitable sample frame: Your frame should include only members of your population of interest, and no one outside it, in order to collect accurate data.
3. Select your sample and start your survey: It can sometimes be challenging to find the right sample and determine a suitable sample frame. Even if all factors are in your favor, there may still be unforeseen issues such as cost, the quality of respondents, and speed of response. Getting a sample to respond to a true probability survey may be difficult, but it is not impossible.
In most cases, drawing a probability sample will save you time, money, and a lot of frustration. You probably can't send surveys to everyone, but you can always give everyone a chance to participate; this is what a probability sample is all about.

When to use Probability Sampling
1. When the sampling bias has to be reduced: This sampling method is used when bias has to be minimal. How researchers select their sample largely determines the quality of their findings, and probability sampling leads to higher-quality findings because it provides an unbiased representation of the population.
2. When the population is diverse: When your population is large and diverse, this sampling method is used extensively, because probability sampling helps researchers create samples that fully represent the population. Say we want to find out how many people prefer medical tourism over getting treated in their own country; this sampling method will help pick samples from various socio-economic strata, backgrounds, etc. to represent the bigger population.
3. To create an accurate sample: Probability sampling helps researchers create accurate samples of their population. Researchers can use proven statistical methods to draw an accurate sample size and obtain well-defined data.

Advantages
1. It is cost- and time-effective: A larger sample can also be chosen by assigning numbers to the units in the frame and then choosing random numbers from the bigger sample.
2. It is simple and easy: Probability sampling is an easy way of sampling, as it does not involve a complicated process. It is quick and saves time.
The time saved can then be used to analyze the data and draw conclusions.
3. It is non-technical: This method of sampling doesn't require deep technical knowledge, because of the simplicity with which it can be done. It is neither complex nor lengthy.

Types of Probability Sampling: Simple Random Sampling, Systematic Sampling, Stratified Random Sampling, Area Sampling, Cluster Sampling

1. Simple Random Sample
Simple random sampling, as the name suggests, is a completely random method of selecting the sample. This sampling method is as easy as assigning numbers to the individuals in the frame and then randomly choosing among those numbers through an automated process; the numbers that are chosen identify the members included in the sample. There are two ways in which the samples are chosen in this method: a lottery system, and number-generating software or a random number table. This sampling technique usually works well for a large population and has its fair share of advantages and disadvantages.

Simple Random Sample Advantages
Ease of use represents the biggest advantage of simple random sampling. Unlike more complicated sampling methods, such as stratified random sampling, there is no need to divide the population into sub-populations or take any other additional steps before selecting members of the population at random. A simple random sample is meant to be an unbiased representation of a group. It is considered a fair way to select a sample from a larger population, since every member of the population has an equal chance of being selected.

Simple Random Sample Disadvantages
A sampling error can occur with a simple random sample if the sample does not end up accurately reflecting the population it is supposed to represent. For example, in a simple random sample of 25 employees, it would be possible to draw 25 men even if the population consisted of 125 women and 125 men. For this reason, simple random sampling is more commonly used when the researcher knows little about the population. If the researcher knew more, it would be better to use a different sampling technique, such as stratified random sampling, which helps to account for differences within the population, such as age, race, or gender. Other disadvantages include the fact that, for large populations, the process can be time-consuming and costly compared with other methods. A minimal selection sketch follows.
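As promised above, here is a minimal sketch of simple random selection from a sampling frame, using the standard library's lottery-style draw. The frame entries are hypothetical.

```python
import random

# The sampling frame: one unique identifier per member of the population.
frame = [f"employee_{i:03d}" for i in range(1, 251)]  # N = 250

random.seed(7)                     # fixed seed only so the example is repeatable
sample = random.sample(frame, 25)  # n = 25; every member has an equal chance

print(sample[:5])  # first few selected units
```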
2. Systematic Sample
Systematic sampling is when you choose every "nth" individual to be a part of the sample; for example, you can choose every 5th person on a list. Systematic sampling is an extended implementation of the same probability technique, in which each member of the group is selected at regular intervals to form the sample. There is an equal opportunity for every member of the population to be selected using this technique.

Risks Associated with Systematic Sampling
One risk that statisticians must consider when conducting systematic sampling involves how the list used with the sampling interval is organized. If the population placed on the list is organized in a cyclical pattern that matches the sampling interval, the selected sample may be biased. For example, a company's human resources department wants to pick a sample of employees and ask how they feel about company policies. Employees are grouped in teams of 20, with each team headed by a manager. If the list used to pick the sample is organized with teams clustered together, the statistician risks picking only managers (or no managers at all), depending on the sampling interval.

3. Stratified Random Sample
Stratified random sampling involves a method where a larger population is divided into smaller groups (strata) that usually don't overlap but together represent the entire population. These groups can be organized, and a sample then drawn from each group separately. A common method is to classify by sex, age, ethnicity, or similar criteria: subjects are split into mutually exclusive groups, and simple random sampling is then used to choose members from each group. Members within each group should be distinct, so that every member of every group gets an equal opportunity of selection using simple probability. This sampling method is also called random quota sampling. (A selection sketch appears after the area sampling example below.)

Advantages of Stratified Random Sampling
The main advantage of stratified random sampling is that it captures key population characteristics in the sample. Similar to a weighted average, this method produces characteristics in the sample that are proportional to those of the overall population. Stratified random sampling works well for populations with a variety of attributes, but is otherwise ineffective if subgroups cannot be formed. Stratification gives a smaller error in estimation and greater precision than the simple random sampling method, and the greater the differences between the strata, the greater the gain in precision.

4. Area Sampling
Area sampling is a method of sampling used when no complete frame of reference is available. The total area under investigation is divided into small sub-areas, which are sampled at random or according to a restricted process (stratification of sampling). Each of the chosen sub-areas is then fully inspected and enumerated, and may form the basis for further sampling if desired.

Application of Area Sampling
The basic idea of area sampling is both simple and powerful. It enjoys wide usage in situations where very high-quality data are wanted but no list of universe items exists. For instance, many governmental agencies (e.g. the Bureau of Labor Statistics) use area sampling. However, the practical execution of a large-scale area sample is highly complex. Typically an area sample is conducted in multiple stages, with successively smaller area clusters being sub-sampled at each stage. Example: a national sample of households is often constructed in a series of steps like this:
(i) Create geographic strata, each consisting of a group of counties in more or less close proximity. Fifty or more such strata, containing all of the roughly 3,000 US counties, are commonly used.
(ii) Within each geographic stratum, choose a probability sample of one or more counties (or groups of counties such as metropolitan areas).
(iii) Within each sample county (or group of counties), choose a probability sample of places (cities, towns, etc.).
(iv) Within each sample place, select a probability sample of area segments (blocks in cities, areas with identifiable boundaries in other places, etc.).
(v) Finally, within sample segments, choose a probability sample of households.
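As promised under (3) above, here is a minimal sketch of stratified random sampling with proportional allocation, continuing the hypothetical-frame style of the earlier example. The strata and their sizes are invented, and simple rounding is used for the allocation.

```python
import random

random.seed(7)
strata = {
    "managers":    [f"mgr_{i}" for i in range(1, 26)],   # 25 people  (10%)
    "engineers":   [f"eng_{i}" for i in range(1, 151)],  # 150 people (60%)
    "sales_staff": [f"sal_{i}" for i in range(1, 76)],   # 75 people  (30%)
}
total = sum(len(members) for members in strata.values())  # N = 250
sample_size = 25                                          # overall n

stratified_sample = []
for name, members in strata.items():
    # Allocate sample slots in proportion to the stratum's share of N.
    # (Rounding can make the achieved total drift slightly from n.)
    n_stratum = round(sample_size * len(members) / total)
    stratified_sample += random.sample(members, n_stratum)
    print(f"{name}: {n_stratum} selected out of {len(members)}")
```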
5. Cluster Sampling
Cluster sampling is a way to randomly select participants when they are geographically spread out. For example, if you wanted to choose 100 participants from the entire population of the U.S., it is likely impossible to get a complete list of everyone. Instead, the researcher randomly selects areas (e.g. cities or counties) and randomly selects from within those boundaries. Cluster sampling usually analyzes a particular population in which the sample consists of more than a few elements, for example a city, family, or university. The clusters are selected by dividing the greater population into various smaller sections.

Cluster Sampling: Steps
Some steps and tips for using cluster sampling in market research are:
• Sample: Decide on the target audience and the size of the sample.
• Create and evaluate sampling frames: Create a sampling frame by using either an existing frame or creating a new one for the target audience. Evaluate frames on the basis of coverage and clustering, and make adjustments accordingly. These groups should be varied, mutually exclusive, and collectively comprehensive; members of the sample are then selected individually.
• Determine groups: Determine the number of groups, keeping roughly the same average number of members in each group, and make sure the groups are distinct from one another.
• Select clusters: Choose clusters randomly for sampling.
• Geographic segmentation: Geographic segmentation is the most commonly used basis for cluster samples.
• Sub-types: Cluster sampling is divided into one-stage and multi-stage sub-types on the basis of the number of steps researchers follow to form clusters.

Cluster Sampling Methods with Examples
There are two ways to classify cluster sampling: by the number of stages followed to obtain the cluster sample, and by the representation of the groups in the entire cluster. The first classification is the most used. In most cases, sampling by clusters happens over multiple stages; a stage is a step taken to get to the desired sample, and cluster sampling is accordingly divided into single-stage, two-stage, and multiple-stage forms.
(I) Single-Stage Cluster Sampling: As the name suggests, sampling is done just once. An example of single-stage cluster sampling: an NGO wants to create a sample of girls across 5 neighboring towns to provide education. Using single-stage cluster sampling, the NGO can randomly select towns (clusters) to form a sample and extend help to the girls deprived of education in those towns.
(II) Two-Stage Cluster Sampling: A sample created in two stages is often better than one created in a single stage, because more filtered elements can be selected, which can lead to improved results. In two-stage cluster sampling, instead of selecting all the elements of a cluster, only a handful of members are selected from each cluster, by implementing systematic or simple random sampling. An example of two-stage cluster sampling: a business owner wants to explore the statistical performance of her plants, which are spread across various parts of the U.S. Considering the number of plants, the number of employees per plant, and the work done at each plant, single-stage sampling would be time- and cost-consuming, so she decides to conduct two-stage sampling. The owner creates samples of employees belonging to different plants to form clusters and then divides them by the size or operational status of the plant. A two-level cluster sample was thus formed, on which other techniques, such as simple random sampling, were applied to proceed with the calculations.
(III) Multiple-Stage Cluster Sampling: For effective research across multiple geographies, one needs to form complicated clusters, which can be achieved only with the multiple-stage cluster sampling technique; steps of listing and sampling are repeated at each stage. An example of multiple-stage cluster sampling: geographic cluster sampling is one of the most extensively implemented cluster sampling techniques. If an organization intends to conduct a survey to analyze the performance of smartphones across Germany, it can divide the entire country's population into cities (clusters), further select the cities with the highest population, and then filter for those using mobile devices.

Cluster Sampling Advantages
There are multiple advantages to using cluster sampling:
(I) Consumes less time and cost: Sampling geographically divided groups requires less work, time, and cost. It is a highly economical method, observing selected clusters with a limited allocation of resources instead of sampling randomly throughout a whole region.
(II) Convenient access: Large samples can be chosen with this sampling technique, increasing accessibility to various clusters.
(III) Least loss in accuracy of data: Since there can be large samples in each cluster, loss of accuracy in information per individual can be compensated for.
(IV) Ease of implementation: Since cluster sampling draws information from various areas and groups, it can be easily implemented in practical situations, in comparison with other probability sampling methods (simple random sampling, systematic sampling, and stratified sampling) and non-probability methods such as convenience sampling. In comparison with simple random sampling, cluster sampling can be effective in determining the characteristics of a group, such as a population, and it can be implemented without having a sampling frame for all the elements of the entire population.

Non-Probability Sample
Non-probability sampling is a sampling technique in which the researcher selects samples based on subjective judgment rather than random selection. In non-probability sampling, not all members of the population have a chance of participating in the study, unlike probability sampling, where each member of the population has a known chance of being selected. Non-probability sampling is most useful for exploratory studies such as a pilot survey (a survey deployed to a smaller sample than the pre-determined sample size). Non-probability sampling is used in studies where it is not possible to draw a random probability sample due to time or cost considerations. It is a less stringent method that depends heavily on the expertise of the researchers; it is carried out by methods of observation and is widely used in qualitative research.

Advantages of non-probability sampling
(i) Non-probability sampling is a more conducive and practical method for researchers deploying surveys in the real world. Statisticians prefer probability sampling because it yields data in the form of numbers; however, if done correctly, non-probability sampling can yield similar, if not the same, quality of results.
(ii) Getting responses using non-probability sampling is faster and more cost-effective than probability sampling, because the sample is known to the researcher; the respondents are motivated to respond quickly, compared with people who are randomly selected.

Disadvantages of non-probability sampling
(i) In non-probability sampling, the researcher needs to think through potential sources of bias. It is important to have a sample that closely represents the population.
(ii) While choosing a sample in non-probability sampling, researchers need to be careful about recruits distorting the data. At the end of the day, research is carried out to obtain meaningful insights and useful data.

When to use non-probability sampling?
• This type of sampling is used to indicate whether a particular trait or characteristic exists in a population.
• This sampling technique is widely used when researchers aim to conduct qualitative research, pilot studies, or exploratory research.
• Non-probability sampling is used when researchers have limited time to conduct research or have budget constraints.
• Non-probability sampling is conducted to observe whether a particular issue needs in-depth analysis.

Types of Non-Probability Sampling: Judgmental or Purposive Sampling, Convenience Sampling, Quota Sampling, Snowball Sampling, Consecutive Sampling

1. Judgmental or Purposive Sampling
In judgmental sampling, the samples are selected based purely on the researcher's knowledge and credibility. In other words, researchers choose only those whom they feel are a right fit (with respect to attributes and representation of the population) to participate in the research study. This is not a scientific method of sampling, and its downside is that the results can be influenced by the preconceived notions of the researcher, so there is a high amount of ambiguity involved in this research technique. For example, this type of sampling method can be used in pilot studies.

2. Convenience Sampling
Convenience sampling is a non-probability sampling technique where samples are selected from the population only because they are conveniently available to the researcher. These samples are selected only because they are easy to recruit, without considering whether they represent the entire population. Ideally, in research, it is good to test a sample that represents the population; but in some research the population is too large to examine in its entirety. This is one of the reasons researchers rely on convenience sampling, the most common non-probability sampling technique, valued for its speed, cost-effectiveness, and the easy availability of the sample. An example of convenience sampling would be using student volunteers known to the researcher: the researcher can send the survey to the students, and they act as the sample in this situation.

3. Quota Sampling
Hypothetically, consider that a researcher wants to study the career goals of male and female employees in an organization. There are 500 employees in the organization, known as the population. In order to understand the population better, the researcher needs only a sample, not the entire population. Further, the researcher is interested in particular strata within the population. This is where quota sampling helps, by dividing the population into strata or groups. For studying the career goals of the 500 employees, technically the sample selected should have proportionate numbers of males and females.
4. SNOWBALL SAMPLING
Snowball sampling helps researchers find samples when they are difficult to locate. Researchers use this technique when the sample size is small and not easily available. This sampling system works like a referral program: once the researchers find suitable subjects, they ask them for assistance in locating similar subjects, so as to form a sample of considerable size. For example, this type of sampling can be used for research involving a particular illness or a rare disease: researchers can ask subjects to refer others suffering from the same ailment, forming a subjective sample for the study.
5. CONSECUTIVE SAMPLING
This non-probability technique is very similar to convenience sampling, with a slight variation. Here, the researcher picks a single person or a group of subjects, conducts research over a period of time, analyzes the results, and then moves on to another subject or group if needed. Consecutive sampling gives the researcher a chance to work with many subjects and fine-tune the research by collecting results that carry vital insights.
Determining Size of the Sample: Practical Considerations in Sampling and Sample Size
Determining sample size is a very important issue: samples that are too large may waste time, resources and money, while samples that are too small may lead to inaccurate results. In many cases we can easily determine the minimum sample size needed to estimate a process parameter, such as the population mean.
Sample size determination is the act of choosing the number of observations or replicates to include in a statistical sample. The sample size is an important feature of any empirical study whose goal is to make inferences about a population from a sample. In practice, the sample size used in a study is determined by the expense of data collection and the need for sufficient statistical power. In complicated studies there may be several different sample sizes: in a stratified survey, for example, there would be a different sample size for each stratum. In a census, data are collected on the entire population, so the sample size equals the population size. In experimental design, where a study may be divided into different treatment groups, there may be a different sample size for each group.
Sample sizes may be chosen in several different ways:
• Experience – small sample sizes, though sometimes necessary, can result in wide confidence intervals or risk errors in statistical hypothesis testing.
• Using a target variance for the estimate to be derived from the sample: if high precision is required (a narrow confidence interval), this translates to a low target variance of the estimator.
• Using a target for the power of a statistical test to be applied once the sample is collected.
• Using a confidence level: the larger the required confidence level, the larger the sample size (given a constant precision requirement).
When sample data are collected and the sample mean is calculated, that sample mean is typically different from the population mean (µ). This difference between the sample and population means can be thought of as an error.
The margin of error E is the maximum difference between the observed sample mean and the true value of the population mean (µ):

E = z(α/2) · σ / √n

where z(α/2) is the critical value, the positive z value at the vertical boundary of the area α/2 in the right tail of the standard normal distribution; σ is the population standard deviation; and n is the sample size. Rearranging this formula, we can solve for the sample size necessary to produce results accurate to a specified confidence level and margin of error:

n = (z(α/2) · σ / E)²

This formula can be used when you know σ and want to determine the sample size necessary to establish, with a confidence of 1 − α, the mean value to within ±E. You can still use this formula if you do not know the population standard deviation and have a small sample size: although it is unlikely that you know σ when the population mean is not known, you may be able to determine it from a similar process or from a pilot test/simulation.
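As a quick illustration of the rearranged formula, the following sketch computes the minimum sample size for a chosen confidence level and margin of error using only the standard library. The input figures (σ = 15 from a hypothetical pilot test, E = 2) are assumed for the example.

```python
# Minimal sketch of n = (z * sigma / E)^2, rounding up to a whole observation.
import math
from statistics import NormalDist

def required_sample_size(sigma, margin_of_error, confidence=0.95):
    # z(alpha/2): critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)

# e.g. sigma = 15 (assumed, from a pilot test), margin of error E = 2, 95% confidence
print(required_sample_size(15, 2))  # 217
```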
Unit 5 Data Analysis
Data Analysis: Editing, Coding, Tabular Representation of Data
Data analysis is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains. In today's business, data analysis plays a role in making decisions more scientific and helping businesses operate effectively.
EDITING
Editing is the process of checking and adjusting responses in the completed questionnaires for omissions, legibility, and consistency, and readying them for coding and storage.
Purpose of Editing
For consistency between and among responses; for completeness in responses, to reduce the effects of item non-response; to better utilize questions answered out of order; and to facilitate the coding process.
Basic Principles of Editing
1. Checking the number of schedules/questionnaires.
2. Completeness (all questions filled in).
3. Legibility.
4. Avoiding inconsistencies in answers.
5. Maintaining a degree of uniformity.
6. Eliminating irrelevant responses.
Types of Editing
1. Field Editing: Preliminary editing by a field supervisor on the same day as the interview, to catch technical omissions, check legibility of handwriting, and clarify responses that are logically or conceptually inconsistent.
2. Office Editing: Editing performed by central office staff; often done more rigorously than field editing.
CODING
Coding is the process of identifying and classifying each answer with a numerical score or other character symbol. The numerical score or symbol is called a code, and it serves as a rule for interpreting, classifying, and recording data. Identifying responses with codes is necessary if the data are to be processed by computer. Coded data are often stored electronically in the form of a data matrix: a rectangular arrangement of the data into rows (representing cases) and columns (representing variables). The data matrix is organized into fields, records, and files:
Field: A collection of characters that represents a single type of data.
Record: A collection of related fields, i.e., fields related to the same case (or respondent).
File: A collection of related records, i.e., records related to the same sample.
Tabular Representation of Data
Presentation of data is of great importance nowadays; after all, what is pleasing to the eye never fails to grab our attention. Presentation of data refers to exhibiting data in an attractive and useful manner such that it can be easily interpreted.
Tabular Representation
A table facilitates the representation of even large amounts of data in an attractive, easy-to-read and organized manner. The data are organized in rows and columns. This is one of the most widely used forms of data presentation, since data tables are easy to construct and read.
Components of Data Tables
• Table Number: Each table should have a specific table number for ease of access and location. This number can be cited anywhere as a reference, leading directly to the data in that particular table.
• Title: A table must have a title that clearly tells readers about the data it contains, the time period of the study, the place of the study and the nature of the classification of the data.
• Headnotes: A headnote further aids the purpose of the title and displays more information about the table. Generally, headnotes present the units of the data in brackets at the end of the table title.
• Stubs: These are the titles of the rows in a table. A stub thus displays information about the data contained in a particular row.
• Caption: A caption is the title of a column in the data table. It is the counterpart of a stub and indicates the information contained in a column.
• Body or field: The body of a table is its content in its entirety. Each item in the body is known as a 'cell'.
• Footnotes: Footnotes are rarely used. In effect, they supplement the title of a table if required.
• Source: When using data obtained from a secondary source, the source has to be mentioned below the footnote.
Construction of Data Tables
There are many ways to construct a good table; some basic ideas are:
• The title should be in accordance with the objective of the study and provide quick insight into the table.
• Comparison: If any two rows or columns may need to be compared, they should be kept close to each other.
• Alternative location of stubs: If the rows in a data table are lengthy, the stubs can be placed on the right-hand side of the table.
• Headings: Headings should be written in the singular form. For example, 'good' must be used instead of 'goods'.
• Footnote: A footnote should be given only if needed.
• Size of columns: Columns should be uniform and symmetrical in size.
• Use of abbreviations: Headings and sub-headings should be free of abbreviations.
• Units: Units should be clearly specified above the columns.
The Advantages of Tabular Representation
• Ease of representation: A large amount of data can easily be confined in a data table; it is evidently the simplest form of data presentation.
• Ease of analysis: Data tables are frequently used for statistical analysis, such as the calculation of central tendency, dispersion, etc.
• Helps in comparison: In a data table, the rows and columns to be compared can be placed next to each other, which makes it easy to compare each value.
• Economical: A data table is fairly easy to construct and presents the data in a manner that is easy on the reader's eyes. Moreover, it saves both time and space.
Frequency Tables, Construct a Frequency Distribution
Statistics is the branch of mathematics that deals with gathering, organizing, estimating and interpreting the numerical data of a survey or a piece of research.
A particular data item may occur more than once in a data set; the number of times it occurs is known as its frequency. When the distribution of frequencies is listed in a table (a tabular presentation of a frequency distribution), it is known as a frequency table. It is used to list one or more variables observed in a sample; each value carries a frequency, and the values are grouped into intervals. Frequency tables may be univariate or joint (bivariate).
A frequency distribution can be defined as a summary presentation of the number of observations of an attribute, or of the values of a variable, arranged according to their magnitudes: either individually, in the case of a discrete series, or in ranges or class intervals, in the case of both discrete and continuous series.
Frequency Table
A frequency distribution table is a way to organize data: an organized tabulation of the number of individual observations located in each category. It contains at least two columns, one for the score categories (X) and another for the frequencies (f). The following example illustrates the idea.
Solved Example
Question: Given a frequency table of the marks obtained by students in an examination (the table itself is not reproduced here), find the number of students who scored more than 85 marks, more than 95 marks, and less than 80 but more than 76 marks.
Solution: From the table we can conclude that:
Students who scored more than 85 = 8 + 5 + 1 = 14
Students who scored more than 95 = 1
Students who scored less than 80 but more than 76 = 14
Construction of Frequency Distribution
The following steps are involved in the construction of a frequency distribution (a worked sketch follows after the list):
(1) Find the range of the data: the range is the difference between the largest and the smallest values.
(2) Decide the approximate number of classes into which the data are to be grouped. There are no hard and fast rules for the number of classes; in most cases 5 to 20 classes are used. H.A. Sturges provides a formula for the approximate number of classes:
K = 1 + 3.322 log N
where K is the number of classes and log N is the logarithm (base 10) of the total number of observations.
Example: If the total number of observations is 50, the number of classes would be
K = 1 + 3.322 log 50 = 1 + 3.322 (1.69897) = 1 + 5.644 = 6.644, i.e. approximately 7 classes.
(3) Determine the approximate class interval size: the size of the class interval, denoted h, is obtained by dividing the range of the data by the number of classes:
h = Range / Number of Classes
In the case of a fractional result, the next higher whole number is taken as the size of the class interval.
(4) Decide the starting point: the lower class limit or class boundary should cover the smallest value in the raw data, and is conveniently a multiple of the class interval size. For example, 0, 5, 10, 15, 20, etc. are commonly used.
(5) Determine the remaining class limits (boundaries): once the lowest class boundary has been decided, the upper class boundary is computed by adding the class interval size to it. The remaining lower and upper class limits are determined by adding the class interval size repeatedly until the largest value of the data is covered by a class.
(6) Distribute the data into the respective classes: all the observations are assigned to their classes using the tally bar (tally mark) method, which is suitable for tabulating the observations. The number of tally bars is counted to get the frequency of each class, and the frequencies of all the classes together give the grouped data or frequency distribution. The total of the frequency column must equal the number of observations.
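The steps above can be sketched in a few lines of code. This is a minimal illustration: the list of marks is invented, and the tallies are printed as a simple text table.

```python
# Minimal sketch of the construction steps: Sturges' rule for the number
# of classes, class width from the range, then tallying. Marks are invented.
import math

marks = [45, 52, 61, 48, 77, 83, 66, 59, 71, 90, 55, 63, 68, 74, 80,
         41, 58, 62, 69, 73, 85, 49, 57, 64, 70]

k = math.ceil(1 + 3.322 * math.log10(len(marks)))   # Sturges' rule
width = math.ceil((max(marks) - min(marks)) / k)    # class interval size h
start = (min(marks) // width) * width               # lower boundary covering the minimum

for lower in range(start, max(marks) + 1, width):
    upper = lower + width
    freq = sum(lower <= m < upper for m in marks)   # tally this class
    print(f"{lower}-{upper - 1}: {'|' * freq} ({freq})")
```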
Graphical Representation of Data: Appropriate Usages of Bar Chart, Pie Charts, Histogram
Graphic representation is another way of analyzing numerical data. A graph is a sort of chart through which statistical data are represented in the form of lines or curves drawn across coordinated points plotted on its surface. Graphs enable us to study the cause-and-effect relationship between two variables, and to measure the extent of change in one variable when another variable changes by a certain amount. Graphs also enable us to study both time series and frequency distributions, as they give a clear account and precise picture of a problem. Graphs are also easy to understand and eye-catching.
General Principles of Graphic Representation
There are some algebraic principles which apply to all types of graphic representation of data. In a graph there are two lines called coordinate axes: one vertical, known as the Y axis, and one horizontal, called the X axis. These two lines are perpendicular to each other, and the point where they intersect is called '0', or the origin. On the X axis, distances to the right of the origin have positive values and distances to the left have negative values. On the Y axis, distances above the origin are positive and distances below it are negative.
BAR CHART
A bar chart or bar graph is a chart or graph that presents categorical data with rectangular bars whose heights or lengths are proportional to the values they represent. The bars can be plotted vertically or horizontally. A vertical bar chart is sometimes called a column graph. A bar graph shows comparisons among discrete categories: one axis of the chart shows the specific categories being compared, and the other axis represents a measured value. Some bar graphs present bars clustered in groups of more than one, showing the values of more than one measured variable. An example would be a vertical bar graph of the number of students who went to different states to study: the rectangular bars are separated by some distance to distinguish them from one another, the horizontal axis represents the specific categories, and the vertical axis shows the discrete numerical values.
PIE CHART
A pie chart (or circle chart) is a circular statistical graphic which is divided into slices to illustrate numerical proportion. In a pie chart, the arc length of each slice (and consequently its central angle and area) is proportional to the quantity it represents. While it is named for its resemblance to a sliced pie, there are variations in the way it can be presented. The earliest known pie chart is generally credited to William Playfair's Statistical Breviary of 1801. Pie charts are very widely used in the business world and the mass media. However, they have been criticized, and many experts recommend avoiding them, pointing out that research has shown it is difficult to compare different sections of a given pie chart, or to compare data across different pie charts. Pie charts can be replaced in most cases by other plots such as the bar chart, box plot or dot plot. (Figure: pie chart of the populations of English native speakers.)
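A minimal plotting sketch of both chart types follows, assuming matplotlib is available. The state names and student counts are illustrative, loosely echoing the bar-graph example above.

```python
# Minimal matplotlib sketch of a bar chart and a pie chart side by side.
import matplotlib.pyplot as plt

states = ["Delhi", "Maharashtra", "Karnataka", "Punjab"]  # hypothetical categories
students = [25, 40, 30, 15]                               # hypothetical counts

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))

# Bar chart: bar heights proportional to the values, bars separated by gaps
ax1.bar(states, students)
ax1.set_xlabel("State")
ax1.set_ylabel("Number of students")

# Pie chart: each slice's angle proportional to its share of the total
ax2.pie(students, labels=states, autopct="%1.0f%%")

plt.tight_layout()
plt.show()
```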
HISTOGRAM
A histogram is a non-cumulative frequency graph drawn on a natural scale, in which the frequencies of the different classes of values are represented by vertical rectangles drawn close to each other. The mode, a measure of central tendency, can easily be determined with the help of this graph.
How to draw a histogram?
Step 1: Represent the class intervals of the variable along the X axis and their frequencies along the Y axis, on a natural scale.
Step 2: Start the X axis at the lower limit of the lowest class interval. When the lower limit happens to be far from the origin, give a break in the X axis to indicate that the portion between the origin and the lowest class has been omitted for convenience. Before plotting, convert the classes into their exact limits.
Step 3: Draw rectangular bars parallel to the Y axis above each class interval, with the class width as the base. The areas of the rectangles must be proportional to the frequencies of the corresponding classes. (A plotting sketch is given below.)
Advantages of the histogram
1. It is easy to draw and simple to understand.
2. It helps us understand the distribution easily and quickly.
3. It is more precise than the frequency polygon.
Limitations of the histogram
1. It is not possible to plot more than one distribution on the same axes, so comparison of more than one frequency distribution is not possible in a single histogram.
2. It is not possible to make it smooth.
Uses of the histogram
1. It represents the data in graphic form.
2. It shows how the scores in the group are distributed: whether the scores are piled up at the lower or higher end of the distribution or are spread evenly and regularly throughout the scale.
Frequency Polygon
The frequency polygon is a related frequency graph, drawn by joining the coordinate points of the mid-values of the class intervals and their corresponding frequencies.
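For completeness, a histogram can be drawn the same way; this minimal sketch again assumes matplotlib, with invented scores and class boundaries of width 5.

```python
# Minimal matplotlib sketch of a histogram: class intervals on the X axis,
# frequencies as adjoining rectangles. Scores and bins are invented.
import matplotlib.pyplot as plt

scores = [12, 15, 17, 21, 22, 24, 25, 27, 28, 31, 33, 34, 36, 38, 41, 44, 45, 49]
bins = range(10, 55, 5)  # class boundaries 10-15, 15-20, ...

plt.hist(scores, bins=bins, edgecolor="black")  # bars drawn close to each other
plt.xlabel("Class interval")
plt.ylabel("Frequency")
plt.show()
```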
Hypothesis: Framing Null Hypothesis and Alternative Hypothesis
Hypothesis
A hypothesis (plural: hypotheses), in a scientific context, is a testable statement about the relationship between two or more variables, or a proposed explanation for some observed phenomenon. In a scientific experiment or study, the hypothesis is a brief summation of the researcher's prediction of the study's findings, which may or may not be supported by the outcome. Hypothesis testing is the core of the scientific method.
The researcher's prediction is usually referred to as the alternative hypothesis, and any other outcome as the null hypothesis: basically, the opposite of the predicted outcome. (The terms are reversed if the researcher predicts no difference or change, hypothesizing, for example, that the incidence of one variable will not increase or decrease in tandem with the other.) The null hypothesis satisfies the requirement of falsifiability (the capacity of a proposition to be proven false), which some schools of thought consider essential to the scientific method. According to others, however, testability is adequate, on the grounds that if there is sufficient support for a hypothesis it is not necessary to be able to conceive of a contrary outcome.
Framing Null Hypothesis
The null hypothesis is a general statement or default position that there is no relationship between two measured phenomena, or no association among groups. Testing (accepting, approving, rejecting, or disproving) the null hypothesis, and thus concluding that there are or are not grounds for believing that there is a relationship between two phenomena (e.g. that a potential treatment has a measurable effect), is a central task in the modern practice of science; the field of statistics gives precise criteria for rejecting a null hypothesis.
A null hypothesis is a precise statement about a population that we try to reject with sample data. We don't usually believe our null hypothesis (or H0) to be true; however, we need some exact statement as a starting point for statistical significance testing.
Null Hypothesis Examples
Often, but not always, the null hypothesis states that there is no association or difference between variables or subpopulations. Some typical null hypotheses are:
• The correlation between frustration and aggression is zero (correlation analysis);
• The average income for men is similar to that for women (independent-samples t-test);
• Nationality is (perfectly) unrelated to music preference (chi-square independence test);
• The average population income was equal over 2012 through 2016 (repeated-measures ANOVA).
"Null" Does Not Mean "Zero"
A common misunderstanding is that "null" implies "zero". This is often, but not always, the case. For example, a null hypothesis may also state that the correlation between frustration and aggression is 0.5: no zero is involved here, and although somewhat unusual, it is perfectly valid. The "null" in "null hypothesis" derives from "nullify": the null hypothesis is the statement that we try to refute, regardless of whether it specifies a zero effect.
Null Hypothesis – Limitations
Suppose we reject a null hypothesis of zero correlation. We have then only concluded that the population correlation is probably not zero; that is the only conclusion from the null hypothesis approach, and it is not really that interesting. What we really want to know is the population correlation itself. A sample correlation of, say, 0.25 is a reasonable estimate; we call such a single number a point estimate. A new sample may come up with a different correlation, and an interesting question is how much the sample correlations would fluctuate over samples if we drew many of them (for instance, with a sample size of N = 100 and a point estimate of 0.25 for the population correlation).
Framing Alternative Hypothesis
An alternative hypothesis is one in which a difference (or an effect) between two or more variables is anticipated by the researchers; that is, the observed pattern of the data is not due to chance. This follows from the tenets of science, in which empirical evidence must be found to refute the null hypothesis before one can claim support for an alternative hypothesis (i.e. that there is in fact a reliable difference or effect in whatever is being studied). The concept of the alternative hypothesis is a central part of formal hypothesis testing.
An alternative hypothesis states that there is a statistically significant relationship between two variables. Consider, for example, the classic Mentos and Diet Coke experiment: the two variables are Mentos and Diet Coke, and the alternative hypothesis, the hypothesis the researcher is trying to establish, is that the Diet Coke will erupt if Mentos are dropped into the bottle. If the bottle erupts, the data support the alternative hypothesis. The alternative hypothesis is generally denoted as H1.
It makes a statement suggesting a potential result or outcome that the investigator or researcher may expect, and it falls into two categories: the directional alternative hypothesis and the non-directional alternative hypothesis.
Key Differences between Null and Alternative Hypothesis
The important points of difference between the null and alternative hypotheses are as follows:
1. A null hypothesis is a statement in which there is no relationship between two variables. An alternative hypothesis is simply its inverse: a statement that there is some statistically significant relationship between the two measured phenomena.
2. A null hypothesis is what the researcher tries to disprove, whereas an alternative hypothesis is what the researcher wants to establish.
3. A null hypothesis represents no observed effect, whereas an alternative hypothesis reflects some observed effect.
4. If the null hypothesis is accepted, no changes will be made to opinions or actions. Conversely, if the alternative hypothesis is accepted, changes in opinions or actions will result.
5. As the null hypothesis refers to a population parameter, its testing is indirect and implicit. The alternative hypothesis, on the other hand, concerns the sample statistic, and its testing is direct and explicit.
6. A null hypothesis is labelled H0 (H-zero), while an alternative hypothesis is represented by H1 (H-one).
7. The mathematical formulation of a null hypothesis uses an equals sign, while that of an alternative hypothesis uses a not-equal (or inequality) sign.
8. Under the null hypothesis, the observations are the outcome of chance, whereas under the alternative hypothesis the observations are the outcome of a real effect.
Conclusion
There are two outcomes of a statistical test: either the null hypothesis is rejected and the alternative hypothesis is accepted, or the null hypothesis is not rejected, on the basis of the evidence. In simple terms, the null hypothesis is just the opposite of the alternative hypothesis.
Concept of Hypothesis Testing: Logic and Importance
Hypothesis Testing
Hypothesis testing was introduced by Ronald Fisher, Jerzy Neyman, Karl Pearson and Pearson's son, Egon Pearson. Hypothesis testing is a statistical method used to make statistical decisions from experimental data. It is basically an assumption that we make about a population parameter, tested to help determine whether the variation between or among groups of data is due to true variation or is the result of sampling variation. With the help of sample data we form assumptions about the population, and then we test those assumptions statistically; this is called hypothesis testing.
Key terms and concepts:
(i) Null hypothesis: a statistical hypothesis that assumes the observation is due to a chance factor. It is denoted by H0: μ1 = μ2, which states that there is no difference between the two population means.
(ii) Alternative hypothesis: contrary to the null hypothesis, the alternative hypothesis states that the observations are the result of a real effect.
(iii) Level of significance: the degree of significance at which we accept or reject the null hypothesis. Since 100% certainty is not possible when accepting or rejecting a hypothesis, we select a level of significance, usually 5%.
(iv) Type I error: rejecting the null hypothesis although it is true. The probability of a Type I error is denoted by alpha (α).
In hypothesis testing, the region of the normal curve that leads to rejection is called the critical (alpha) region.
(v) Type II error: accepting (failing to reject) the null hypothesis although it is false. The probability of a Type II error is denoted by beta (β); the region of the normal curve corresponding to acceptance is called the beta region.
(vi) Power: the probability of correctly rejecting a false null hypothesis; power equals 1 − β.
(vii) One-tailed test: when the alternative hypothesis is directional, asserting a 'less than' or 'greater than' relationship (such as H1: μ1 > μ2), the test is one-tailed.
(viii) Two-tailed test: when the alternative hypothesis is non-directional, asserting only that the values differ (such as H1: μ1 ≠ μ2), the test is two-tailed.
Importance of Hypothesis Testing
Hypothesis testing is one of the most important concepts in statistics, because it is how you decide whether something really happened, whether certain treatments have effects, whether groups differ from each other, or whether one variable predicts another. In short, you want to show that your data are statistically significant and unlikely to have occurred by chance alone. In essence, then, a hypothesis test is a test of significance.
Possible Conclusions
Once the statistics are collected and you test your hypothesis against the likelihood of chance, you draw your final conclusion. If you reject the null hypothesis, you are claiming that your result is statistically significant and did not happen by luck or chance; the outcome then supports the alternative hypothesis. If you fail to reject the null hypothesis, you must conclude that you did not find an effect or difference in your study. This method is how many pharmaceutical drugs and medical procedures are tested.
Tests of Significance: Small Sample Test
Tests of Significance
Once sample data have been gathered through an observational study or experiment, statistical inference allows analysts to assess the evidence in favor of some claim about the population from which the sample was drawn. The methods of inference used to support or reject claims based on sample data are known as tests of significance.
Every test of significance begins with a null hypothesis H0. H0 represents a theory that has been put forward, either because it is believed to be true or because it is to be used as a basis for argument, but that has not been proved. For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug. We would write H0: there is no difference between the two drugs on average.
The alternative hypothesis, Ha, is a statement of what the statistical hypothesis test is set up to establish. For example, in a clinical trial of a new drug, the alternative hypothesis might be that the new drug has a different effect, on average, than the current drug. We would write Ha: the two drugs have different effects, on average. The alternative hypothesis might also be that the new drug is better, on average, than the current drug; in this case we would write Ha: the new drug is better than the current drug, on average.
The final conclusion, once the test has been carried out, is always given in terms of the null hypothesis. We either "reject H0 in favor of Ha" or "do not reject H0"; we never conclude "reject Ha", or even "accept Ha".
If we conclude "do not reject H0", this does not necessarily mean that the null hypothesis is true; it only suggests that there is not sufficient evidence against H0 in favor of Ha. Rejecting the null hypothesis, on the other hand, suggests that the alternative hypothesis may be true.
Hypotheses are always stated in terms of a population parameter, such as the mean μ. An alternative hypothesis may be one-sided or two-sided. A one-sided hypothesis claims that a parameter is either larger or smaller than the value given by the null hypothesis. A two-sided hypothesis claims that a parameter is simply not equal to the value given by the null hypothesis; the direction does not matter.
The approach described in this lesson is appropriate as long as the sample includes at least one success and one failure. The key steps are:
• Formulate the hypotheses to be tested: state the null hypothesis and the alternative hypothesis.
• Determine the sampling distribution of the proportion. If the sample proportion is the outcome of a binomial experiment, the sampling distribution will be binomial; if it is the outcome of a hypergeometric experiment, the sampling distribution will be hypergeometric.
• Specify the significance level. (Researchers often set the significance level to 0.05 or 0.01, although other values may be used.)
• Based on the hypotheses, the sampling distribution, and the significance level, define the region of acceptance.
• Test the null hypothesis: if the sample proportion falls within the region of acceptance, do not reject the null hypothesis; otherwise, reject it.
The following examples illustrate how to test hypotheses with small samples. The first example involves a binomial experiment, and the second a hypergeometric experiment.
Example 1: Sampling With Replacement
Suppose an urn contains 30 marbles; some marbles are red, and the rest are green. A researcher hypothesizes that the urn contains 15 or more red marbles. The researcher randomly samples five marbles, with replacement, from the urn. Two of the selected marbles are red, and three are green. Based on the sample results, should the researcher reject the null hypothesis? Use a significance level of 0.20.
Solution: There are five steps in conducting a hypothesis test, as described in the previous section. We work through each of the five steps below:
(I) Formulate hypotheses: The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis: P >= 0.50
Alternative hypothesis: P < 0.50
Note that these hypotheses constitute a one-tailed test: the null hypothesis will be rejected only if the sample proportion is too small.
(II) Determine the sampling distribution: Since we sampled with replacement, the sample proportion can be considered an outcome of a binomial experiment. Based on the null hypothesis, we assume that at least 15 of the 30 marbles are red; thus, the true population proportion is assumed to be 15/30 = 0.50. Given those inputs (a binomial distribution with a true population proportion of 0.50), the sampling distribution of the proportion can be determined by computing the binomial probability of each possible outcome.
(III) Specify the significance level: The significance level was set at 0.20. (This means that the probability of making a Type I error is 0.20, assuming that the null hypothesis is true.)
(IV) Define the region of acceptance: From the sampling distribution, we see that it is not possible to define a region of acceptance for which the significance level is exactly 0.20; however, we can define one for which the significance level is no more than 0.20. If the true population proportion is 0.50, we would be very unlikely to pick 0 or 1 red marbles in our sample of 5: the probability of selecting 1 or 0 red marbles is 0.1875. Therefore, if we let the significance level equal 0.1875, we can define the region of rejection as any sampled outcome that includes only 0 or 1 red marble (i.e., a sample proportion of 0 or 0.20), and the region of acceptance as any sampled outcome that includes at least 2 red marbles, equivalent to a sample proportion greater than or equal to 0.40.
(V) Test the null hypothesis: Since the sample proportion (0.40) is within the region of acceptance, we cannot reject the null hypothesis.
Example 2: Sampling Without Replacement
The Acme Advertising company has 25 clients. Account executives at Acme claim that 80 percent of these clients are very satisfied with the service they receive. To test that claim, Acme's CEO commissions a survey of 10 clients. Survey participants are randomly sampled, without replacement, from the client population. Six of the ten sampled customers (i.e., 60 percent) say that they are very satisfied. Based on the sample results, should the CEO accept or reject the hypothesis that 80 percent of Acme's clients are very satisfied? Use a significance level of 0.10.
Solution: Again we work through the five steps:
(I) Formulate hypotheses:
Null hypothesis: P >= 0.80
Alternative hypothesis: P < 0.80
These hypotheses constitute a one-tailed test: the null hypothesis will be rejected only if the sample proportion is too small.
(II) Determine the sampling distribution: Since we sampled without replacement, the sample proportion can be considered an outcome of a hypergeometric experiment. Based on the null hypothesis, we assume that at least 80 percent of the 25 clients (i.e., 20 clients) are very satisfied. Given those inputs (a hypergeometric distribution in which 20 of 25 clients are very satisfied), the sampling distribution of the proportion can be determined by computing the hypergeometric probability of each possible outcome.
(III) Specify the significance level: The significance level was set at 0.10. (This means that the probability of making a Type I error is 0.10, assuming that the null hypothesis is true.)
(IV) Define the region of acceptance: It is not possible to define a region of acceptance for which the significance level is exactly 0.10; however, we can define one for which it is no more than 0.10. If the true proportion of very satisfied clients is 0.80, we would be very unlikely to have fewer than 7 very satisfied clients in our sample: the probability of having 6 or fewer very satisfied clients is 0.064. Therefore, if we let the significance level equal 0.064, we can define the region of rejection as any sampled outcome with 6 or fewer very satisfied customers, and the region of acceptance as any sampled outcome with 7 or more, equivalent to a sample proportion greater than or equal to 0.70.
(V) Test the null hypothesis: Since the sample proportion (0.60) is outside the region of acceptance, we reject the null hypothesis at the 0.064 level of significance. Both tail probabilities can be checked with a few lines of code, as sketched below.
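A minimal sketch verifying the two tail probabilities quoted above (0.1875 and roughly 0.064), assuming scipy is available:

```python
# Reproduce the binomial and hypergeometric tail probabilities from the
# two small-sample examples.
from scipy.stats import binom, hypergeom

# Example 1: P(0 or 1 red marbles in 5 draws with replacement, p = 0.50)
print(binom.cdf(1, n=5, p=0.5))            # 0.1875

# Example 2: P(6 or fewer very satisfied clients in 10 draws without
# replacement from 25 clients, of whom 20 are very satisfied)
print(hypergeom.cdf(6, M=25, n=20, N=10))  # ~0.064
```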
T-Test (Mean, Proportion)
The t test is one type of inferential statistic. It is used to determine whether there is a significant difference between the means of two groups. As with all inferential statistics, we assume the dependent variable fits a normal distribution; when we assume a normal distribution exists, we can identify the probability of a particular outcome. We specify the level of probability (alpha level, level of significance, p) we are willing to accept before we collect data (p < .05 is a commonly used value). After we collect data we calculate a test statistic with a formula, and we compare our test statistic with a critical value found in a table to see whether our results fall within the acceptable level of probability.
When the difference between two population averages is being investigated, a t test is used; in other words, a t test is used when we wish to compare two means (the scores must be measured on an interval or ratio measurement scale). We would use a t test if we wished to compare the reading achievement of boys and girls. With a t test, we have one independent variable and one dependent variable: the independent variable (gender, in this case) can have only two levels (male and female), and the dependent variable would be reading achievement. If the independent variable had more than two levels, we would instead use a one-way analysis of variance (ANOVA).
The test statistic that a t test produces is a t-value. Conceptually, t-values are an extension of z-scores: in a sense, the t-value represents how many standard units apart the means of the two groups are. With a t test, the researcher wants to state with some degree of confidence that the obtained difference between the means of the sample groups is too great to be a chance event, and that some difference also exists in the population from which the samples were drawn. In other words, the difference that we might find between the boys' and girls' reading achievement in our sample might have occurred by chance, or it might exist in the population. If our t test produces a t-value with an associated probability of .01, we say that the likelihood of getting the observed difference by chance alone is 1 in 100; we can then say that it is unlikely that our results occurred by chance, and that the difference found in the sample probably exists in the populations from which the samples were drawn. A code sketch follows after the assumptions below.
ASSUMPTIONS UNDERLYING THE T TEST
• The samples have been randomly drawn from their respective populations.
• The scores in the populations are normally distributed.
• The scores in the populations have the same variance (σ1 = σ2). Note: we use a different calculation for the standard error if they do not.
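The comparison of two group means described above can be run with scipy's independent-samples t test. This is a minimal sketch; the reading-achievement scores are invented for illustration.

```python
# Minimal independent-samples t test on two hypothetical score lists.
from scipy.stats import ttest_ind

boys = [72, 68, 75, 80, 66, 71, 74, 69]
girls = [78, 74, 81, 77, 72, 79, 83, 76]

# equal_var=True gives the pooled-variance t test; set it to False for
# the separate-variance (Welch) test when the variances differ.
t_stat, p_value = ttest_ind(boys, girls, equal_var=True)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")  # reject H0 if p < .05
```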
Procedure of Hypothesis Testing
Testing a hypothesis refers to verifying whether the hypothesis is valid. Hypothesis testing attempts to establish whether or not the null hypothesis should be accepted. The procedure of hypothesis testing includes all the steps that a researcher undertakes in making a choice between the two alternative actions of rejecting or accepting a null hypothesis. The various steps involved are as follows:
1) Making a Formal Statement: This step involves a formal statement of the null hypothesis (H0) and the alternative hypothesis (Ha), clearly framed within the purview of the research problem. For example, suppose a school teacher wants to test whether the students' understanding, measured in marks, exceeds 90 per cent; the hypotheses may be stated as:
Null hypothesis H0: μ = 90
Alternative hypothesis Ha: μ > 90
2) Selecting a Significance Level: The hypotheses should be tested at a pre-determined, explicitly specified level of significance; usually either the 5% or the 1% level is used. The factors that determine the level of significance are: (a) the magnitude of the difference between the sample means; (b) the sample size; (c) the variability of measurements within samples; and (d) whether the hypothesis is directional or non-directional. In sum, the level of significance should be adequate in the context of the nature and purpose of the enquiry.
3) Deciding the Distribution to Use: After deciding on the level of significance, the researcher must determine the appropriate sampling distribution. The choice generally lies between the normal distribution and the t-distribution; the rules governing the selection are similar to those already discussed with respect to estimation.
4) Selecting a Random Sample and Computing an Appropriate Value: The next step is to select a random sample and compute a suitable value of the test statistic from the sample data using the appropriate distribution; in other words, to draw a sample that furnishes empirical data.
5) Calculating the Probability: The researcher then calculates the probability that the sample result would diverge as widely as it has from expectations, if the null hypothesis were actually true.
6) Comparing the Probability: Finally, the calculated probability is compared with the specified value of α, the significance level. If the calculated probability is equal to or smaller than α (in the case of a one-tailed test), the null hypothesis is rejected; if it is greater, the null hypothesis is accepted. When the null hypothesis H0 is rejected, the researcher runs the risk of committing a Type I error; when H0 is accepted, the researcher runs some risk of committing a Type II error (a risk whose size cannot be specified as long as H0 is vague rather than specific).
THREE TYPES OF T TESTS
1. Pair-difference t test (a.k.a. t test for dependent groups, correlated t test): df = n (number of pairs) − 1. This test is concerned with the difference between the average scores of a single sample of individuals assessed at two different times (such as before and after treatment). It can also compare the average scores of samples of individuals who are paired in some way (such as siblings, mothers and daughters, or persons matched on a particular characteristic).
2. t test for Independent Samples (with two options): This test is concerned with the difference between the averages of two populations. The procedure compares the averages of two samples that were selected independently of each other, and asks whether those sample averages differ enough to believe that the populations from which they were drawn also have different averages. An example would be comparing the math achievement scores of an experimental group with those of a control group.
• Equal variance (pooled-variance t test): df = n (total of both groups) − 2. Note: used when both samples have the same number of subjects, or when σ1 = σ2 (Levene or F-max tests have p > .05).
• Unequal variance (separate-variance t test): df depends on a formula, but a rough estimate is one less than the size of the smallest group. Note: used when the samples have different numbers of subjects and different variances, σ1 ≠ σ2 (Levene or F-max tests have p < .05).
F-Test, Z-Test
F-TEST
An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact "F-tests" mainly arise when the models have been fitted to the data using least squares. The name was coined by George W. Snedecor in honour of Sir Ronald A. Fisher, who initially developed the statistic as the variance ratio in the 1920s.
Assumptions of the F-Test
Several assumptions are made for the test. The populations must be approximately normally distributed (i.e. fit the shape of a bell curve), and the samples must be independent events. In addition, bear in mind a few important points:
• The larger variance should always go in the numerator (the top number) to force the test into a right-tailed test; right-tailed tests are easier to calculate.
• For two-tailed tests, divide alpha by 2 before finding the right critical value.
• If you are given standard deviations, they must be squared to get the variances.
• If your degrees of freedom aren't listed in the F table, use the larger critical value. This helps to avoid the possibility of Type I errors.
Common examples
Common examples of the use of F-tests include the study of the following cases:
• The hypothesis that the means of a given set of normally distributed populations, all having the same standard deviation, are equal. This is perhaps the best-known F-test, and it plays an important role in the analysis of variance (ANOVA).
• The hypothesis that a proposed regression model fits the data well (see lack-of-fit sum of squares).
• The hypothesis that a data set in a regression analysis follows the simpler of two proposed linear models that are nested within each other.
F test to compare two variances by hand: steps
F tests can be tedious to calculate by hand, especially if you also have to calculate the variances, so in practice you are better off using technology. The general steps are as follows (a code sketch follows after Step 6 below):
Step 1: If you are given standard deviations, go to Step 2. If you are given variances to compare, go to Step 3.
Step 2: Square both standard deviations to get the variances.
For example, if the standard deviations are s1 = 9.6 and s2 = 10.9, the variances would be 9.6² = 92.16 and 10.9² = 118.81.
Step 3: Take the largest variance and divide it by the smallest variance to get the F-value. For example, if your two variances were s1² = 2.5 and s2² = 9.4, divide 9.4 / 2.5 = 3.76. Placing the largest variance on top forces the F-test into a right-tailed test, which is much easier to calculate than a left-tailed test.
Step 4: Find your degrees of freedom. The degrees of freedom are the sample size minus 1; as you have two samples (variance 1 and variance 2), you have two degrees-of-freedom values, one for the numerator and one for the denominator.
Step 5: Look up the F-value you calculated in Step 3 in the F table. Note that there are several tables, so you will need to locate the right table for your alpha level.
Step 6: Compare your calculated value (Step 3) with the table F-value (Step 5). If the table value is smaller than the calculated value, you can reject the null hypothesis.
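A minimal sketch of Steps 3 through 6, assuming scipy is available; the variances are the illustrative ones from Step 3, and the degrees of freedom (10 and 10) are assumed for the example.

```python
# Form the variance ratio with the larger variance on top, then compare
# against the right-tail F critical value.
from scipy.stats import f

s1_sq, s2_sq = 9.4, 2.5          # illustrative sample variances (Step 3)
df1, df2 = 10, 10                # hypothetical degrees of freedom (n - 1 each)

f_value = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)   # 3.76
critical = f.ppf(1 - 0.05, df1, df2)              # right-tail critical value, alpha = .05

print(f_value, critical, f_value > critical)      # reject H0 if f_value > critical
```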
Z-TEST
A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. Because of the central limit theorem, many test statistics are approximately normally distributed for large samples. For each significance level, the Z-test has a single critical value (for example, 1.96 for a 5% two-tailed test), which makes it more convenient than the Student's t-test, which has separate critical values for each sample size. Therefore, many statistical tests can conveniently be performed as approximate Z-tests if the sample size is large or the population variance is known. If the population variance is unknown (and therefore has to be estimated from the sample itself) and the sample size is not large (n < 30), the Student's t-test may be more appropriate.
A one-sample location test, a two-sample location test, a paired difference test and a maximum likelihood estimate are examples of tests that can be conducted as Z-tests. Z-tests are closely related to t-tests, but t-tests are best performed when an experiment has a small sample size. Also, t-tests assume the standard deviation is unknown, while Z-tests assume it is known; if the standard deviation of the population is unknown, the assumption that the sample variance equals the population variance is made.
One-Sample Z-Test Example
For example, assume an investor wishes to test whether the average daily return of a stock is greater than 1%. A simple random sample of 50 returns has an average of 2%; assume the standard deviation of the returns is 2.5%. The null hypothesis is that the mean return is equal to 1%, and the alternative hypothesis is that the mean return is greater than 1%. Assume an alpha of 0.05 is selected with a two-tailed test; then 0.025 of the area lies in each tail, and the critical value is 1.96 (or −1.96). If the value of z is greater than 1.96 or less than −1.96, the null hypothesis is rejected.
The value of z is calculated by subtracting the hypothesized average daily return (1% in this case) from the observed sample average, and then dividing the result by the standard deviation divided by the square root of the number of observations. The test statistic is therefore (0.02 − 0.01) / (0.025 / √50) = 2.83. The investor rejects the null hypothesis, since z is greater than 1.96, and concludes that the average daily return is greater than 1%. The same calculation is sketched below.
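A minimal sketch reproducing that calculation, using only the standard library; the figures are the ones from the example above.

```python
# One-sample z test: z = (xbar - mu0) / (sigma / sqrt(n)).
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n = 0.02, 0.01, 0.025, 50

z = (x_bar - mu0) / (sigma / sqrt(n))
critical = NormalDist().inv_cdf(1 - 0.05 / 2)  # 1.96 for a two-tailed test at alpha = .05

print(round(z, 2), abs(z) > critical)  # 2.83, True -> reject H0
```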
Cross Tabulation, Chi-Squared Test
Cross tabulation is a mainstream statistical technique that helps you make informed decisions in your research by identifying patterns, trends and correlations between the parameters of your study. When conducting a study, the raw data can be daunting and will usually point to several chaotic possible outcomes; in such situations, cross-tabs help you zero in on a single theory by drawing out trends, comparisons and correlations between mutually inclusive factors within your study.
For example, consider your college application: you probably did not realize it at the time, but you were mentally cross-tabulating the factors involved to arrive at a conscious decision about which colleges you wanted to attend and had the best shot at. First came the academic factors: your grades throughout high school, SAT scores, the field you wanted to major in, and the application essay you would need to write. Second came the financial factor, covering tuition fees and the possibility of a scholarship. Last, but not least, came the emotional factor: the distance from home, and how far away the universities your friends were considering were, so that reunions would not be an issue. In other words, cross-tabulating Academics + Finance + Emotions led you to a refined list of universities, one of which is, or soon will be, your alma mater.
Cross tabulation, also known as a cross-tab or contingency table, is a statistical tool used for categorical data. Categorical data involves values that are mutually exclusive of each other. Data are always collected as numbers, but numbers have no value unless they mean something: 4, 7, 9 are simply numerals until specified as, say, 4 apples, 7 bananas, and 9 kiwis. Cross tabulation is used to examine relationships within data that are not readily evident. It is quite useful in market research studies and in surveys, as a cross-tab report shows the connection between two or more questions asked in the survey.
Understanding Cross Tabulation with an Example
Cross-tab is a popular choice for statistical data analysis. Since it is a reporting/analyzing tool, it can be used with any level of data (ordinal or nominal), because it treats all data as nominal (nominal data are not measured, they are categorized). For instance, you can analyze the relationship between two categorical variables, such as age and purchase of electronic gadgets, from two survey questions: (i) What is your age? (ii) Which electronic gadget are you likely to buy in the next 6 months? Cross-tabulating the answers can reveal a distinctive connection between age and gadget purchase: it is not surprising, but certainly interesting, to see the correlation between the two variables emerge from the collected data. In survey research, cross-tabs allow you to dive deep into the data, making it simpler to spot trends and opportunities without getting overwhelmed by all the responses.
Chi-Squared Test
A chi-squared test, also written as χ² test, is any statistical hypothesis test in which the sampling distribution of the test statistic is a chi-squared distribution when the null hypothesis is true. Without other qualification, 'chi-squared test' is often short for Pearson's chi-squared test. The chi-squared test is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories.
In the standard applications of the test, the observations are classified into mutually exclusive classes, and there is some theory, the null hypothesis, which gives the probability that any observation falls into each class. The purpose of the test is to evaluate how likely the observed frequencies would be, assuming the null hypothesis is true. Chi-squared tests are often constructed from a sum of squared errors, or through the sample variance. Test statistics that follow a chi-squared distribution arise from an assumption of independent normally distributed data, which is valid in many cases due to the central limit theorem. A chi-squared test can be used to attempt rejection of the null hypothesis that the data are independent.
How to Calculate a Chi-square Statistic?
The formula for calculating a chi-square statistic is:

χ² = Σ (O − E)² / E

where O stands for the observed frequency and E stands for the expected frequency in each cell. The expected count is subtracted from the observed count to find the difference between the two. The difference is then squared to get rid of negative values (the squares of 2 and −2 are, of course, both 4), and the squared difference is divided by the expected count to normalize larger and smaller values (so that we do not get bigger chi-square values merely because we are working with large data sets). The sigma sign indicates that these values, calculated for each cell, are summed.
As an example, suppose we want to find out whether there is an association between smoking and lung disease. The null and alternative hypotheses will be:
H0: There is no association between smoking and lung disease.
H1: There is an association between smoking and lung disease.
A sketch of this test on hypothetical data follows below.
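A minimal sketch of the test, assuming scipy is available. The 2×2 table of observed counts is entirely hypothetical, since the text supplies no data.

```python
# Chi-squared test of independence on a hypothetical smoking/lung-disease table.
from scipy.stats import chi2_contingency

observed = [[60, 40],   # smokers: disease / no disease
            [30, 70]]   # non-smokers: disease / no disease

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")
# Reject H0 (no association) if p < .05.
```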
Analysis of Variance: One Way and Two Way Classifications

Analysis of Variance (ANOVA) is a parametric statistical technique used to compare data sets. The technique was invented by the statistician and evolutionary biologist R.A. Fisher, and is thus often referred to as Fisher's ANOVA. It is similar in application to techniques such as the t-test and z-test, in that it is used to compare means and the relative variance between them. However, ANOVA is best applied where more than two populations or samples are to be compared.

Analysis of variance is a collection of statistical models and their associated estimation procedures (such as the "variation" among and between groups) used to analyze the differences among group means in a sample. In the ANOVA setting, the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether the population means of several groups are equal, and therefore generalizes the t-test to more than two groups. ANOVA is useful for comparing (testing) three or more group means for statistical significance. It is conceptually similar to multiple two-sample t-tests, but is more conservative, resulting in fewer type I errors, and is therefore suited to a wide range of practical problems.

The Formula for ANOVA

The following formula represents a one-way ANOVA test:

F = MST / MSE

Where:
F = ANOVA coefficient
MST = Mean sum of squares due to treatment
MSE = Mean sum of squares due to error

Example of How to Use ANOVA

A researcher might, for example, test students from multiple colleges to see if students from one of the colleges consistently outperform students from the other schools. In a business application, an R&D researcher might test two different processes of creating a product to see if one process is better than the other in terms of cost efficiency. Which type of ANOVA to run depends on a number of factors. ANOVA is applied when the data are experimental, and it can be computed by hand when there is no access to statistical software. It is simple to use and best suited to small samples. With many experimental designs, the sample sizes have to be the same for the various factor-level combinations.

ANOVA is helpful for testing three or more variables. It is similar to multiple two-sample t-tests, but results in fewer type I errors and is appropriate for a wider range of issues. ANOVA tests for differences by comparing the means of each group, partitioning the variance into its diverse sources. It is employed with subjects, test groups, between groups and within groups.

ONE-WAY ANOVA

A one-way ANOVA is a type of statistical test that compares the variance in the group means within a sample whilst considering only one independent variable or factor. It is a hypothesis-based test, meaning that it aims to evaluate multiple mutually exclusive theories about our data. Before we can generate a hypothesis, we need to have a question about our data that we want an answer to. For example, adventurous researchers studying a population of walruses might ask, "Do our walruses weigh more in early or late mating season?" Here, the independent variable or factor (the two terms mean the same thing) is "month of mating season". In an ANOVA, our independent variables are organized into categorical groups. For example, if the researchers looked at walrus weight in December, January, February and March, there would be four months analyzed, and therefore four groups in the analysis.

A one-way ANOVA compares three or more categorical groups to establish whether there is a difference between them. Within each group there should be three or more observations (here, walruses), and the means of the samples are compared.

Hypotheses of One-Way ANOVA

In a one-way ANOVA there are two possible hypotheses.
• The null hypothesis (H0) is that there is no difference between the groups, and equality between their means. (Walruses weigh the same in different months.)
• The alternative hypothesis (H1) is that there is a difference between the means and groups. (Walruses have different weights in different months.)

Assumptions of One-Way ANOVA

• Normality – each sample is taken from a normally distributed population.
• Sample independence – each sample has been drawn independently of the other samples.
• Variance equality – the variance of the data in the different groups should be the same.
• Your dependent variable – here, "weight" – should be continuous, that is, measured on a scale which can be subdivided using increments (e.g. grams, milligrams).
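As a minimal sketch of the one-way case, assuming scipy is available; the walrus weights below are invented for illustration.

from scipy.stats import f_oneway

# Hypothetical walrus weights in kg for four month groups (values invented).
december = [1180, 1220, 1205, 1190]
january = [1210, 1250, 1235, 1225]
february = [1300, 1280, 1310, 1295]
march = [1330, 1345, 1320, 1360]

# One-way ANOVA: do the mean weights differ across the four months?
f_stat, p = f_oneway(december, january, february, march)
print(f"F = {f_stat:.2f}, p = {p:.4f}")
# p < 0.05 would reject H0 (equal means): at least one month differs.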
TWO-WAY ANOVA

A two-way ANOVA is, like a one-way ANOVA, a hypothesis-based test. However, in the two-way ANOVA each sample is defined in two ways, and is consequently put into two categorical groups. Thinking again of our walruses, researchers might use a two-way ANOVA if their question is: "Are walruses heavier in early or late mating season, and does that depend on the gender of the walrus?" In this example, both "month of mating season" and "gender of walrus" are factors, meaning that in total there are two factors. Once again, the number of groups in each factor must be considered: for "gender" there will be only two groups, "male" and "female". The two-way ANOVA therefore examines the effect of two factors (month and gender) on a dependent variable – in this case weight – and also examines whether the two factors interact with each other to influence the continuous variable.

Assumptions of Two-Way ANOVA

• Your dependent variable – here, "weight" – should be continuous, that is, measured on a scale which can be subdivided using increments (e.g. grams, milligrams).
• Your two independent variables – here, "month" and "gender" – should be in categorical, independent groups.
• Sample independence – each sample has been drawn independently of the other samples.
• Variance equality – the variance of the data in the different groups should be the same.
• Normality – each sample is taken from a normally distributed population.

Hypotheses of Two-Way ANOVA

Because the two-way ANOVA considers the effect of two categorical factors, and the effect of the categorical factors on each other, there are three pairs of null and alternative hypotheses for the two-way ANOVA. Here we present them for our walrus experiment, where month of mating season and gender are the two independent variables.

H0: The means of all month groups are equal. H1: The mean of at least one month group is different.
H0: The means of the gender groups are equal. H1: The means of the gender groups are different.
H0: There is no interaction between month and gender. H1: There is an interaction between month and gender.

Summary: Differences between One-Way and Two-Way ANOVA

The key differences between one-way and two-way ANOVA are summarized below.
1. A one-way ANOVA is primarily designed to test the equality of three or more means. A two-way ANOVA is designed to assess the interrelationship of two independent variables with a dependent variable.
2. A one-way ANOVA involves only one factor or independent variable, whereas there are two independent variables in a two-way ANOVA.
3. In a one-way ANOVA, the one factor or independent variable analyzed has three or more categorical groups. A two-way ANOVA instead compares multiple groups across two factors.
4. A one-way ANOVA needs to satisfy only two principles of the design of experiments, i.e. replication and randomization, whereas a two-way ANOVA meets all three principles of the design of experiments: replication, randomization and local control.
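For the two-way design, here is a minimal sketch using statsmodels (one possible choice of package, assumed here; the weights and the deliberately tiny balanced sample are invented for illustration).

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical balanced data (values invented): weight by month and gender.
df = pd.DataFrame({
    "weight": [1180, 1220, 1300, 1340, 1020, 1060, 1130, 1170],
    "month": ["early", "early", "late", "late"] * 2,
    "gender": ["male"] * 4 + ["female"] * 4,
})

# Two-way ANOVA with interaction: month, gender, and month x gender.
model = ols("weight ~ C(month) * C(gender)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))

The resulting ANOVA table reports one F-test per pair of hypotheses listed above: a main effect for month, a main effect for gender, and the month × gender interaction.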
Mechanism of Report Writing

There are very definite and set rules which should be followed in the actual preparation of the research report or paper. Once the techniques are finally decided, they should be scrupulously adhered to, and no deviation permitted. The criteria of format should be decided as soon as the materials for the research paper have been assembled. The following points deserve mention so far as the mechanics of writing a report are concerned:

1. Size and physical design
The manuscript should be written on unruled paper 8½ × 11 inches in size. If it is to be written by hand, then black or blue-black ink should be used. A margin of at least one and a half inches should be allowed at the left hand side, and of at least half an inch at the right hand side of the paper. There should also be one-inch margins at the top and bottom. The paper should be neat and legible. If the manuscript is to be typed, then all typing should be double-spaced on one side of the page only, except for the insertion of long quotations.

2. Procedure
The various steps in writing the report should be strictly adhered to (all such steps have already been explained earlier in this chapter).

3. Layout
Keeping in view the objective and nature of the problem, the layout of the report should be thought out, decided and accordingly adopted (the layout of the research report and the various types of reports have been described earlier in this chapter and should be taken as a guide for report writing in the case of a particular problem).

4. Treatment of quotations
Quotations should be placed in quotation marks and double spaced, forming an immediate part of the text. But if a quotation is of considerable length (more than four or five typewritten lines) it should be single-spaced and indented at least half an inch to the right of the normal text margin.

5. The footnotes
Regarding footnotes, one should keep the following in view:
• Footnotes serve two purposes: the identification of materials used in quotations in the report, and notice of materials not immediately necessary to the body of the research text but still of supplemental value. In other words, footnotes are meant for cross references, citation of authorities and sources, acknowledgement, and elucidation or explanation of a point of view. It should always be kept in view that a footnote is neither an end in itself nor a means of displaying scholarship. The modern tendency is to make minimum use of footnotes, for scholarship does not need to be displayed.
• Footnotes are placed at the bottom of the page on which the reference or quotation they identify or supplement ends. Footnotes are customarily separated from the textual material by a space of half an inch and a line about one and a half inches long.
• Footnotes should be numbered consecutively, usually beginning with 1 in each chapter separately. The number should be put slightly above the line, say at the end of a quotation. At the foot of the page, again, the footnote number should be indented and typed a little above the line. Thus, consecutive numbers must be used to correlate the reference in the text with its corresponding note at the bottom of the page, except in the case of statistical tables and other numerical material, where symbols such as the asterisk (*) and the like may be used to prevent confusion.
• Footnotes are always typed in single space, though they are divided from one another by a double space.

6. Documentation style
Regarding documentation, the first footnote reference to any given work should be complete in its documentation, giving all the essential facts about the edition used. Such documentary footnotes follow a general sequence.
The common order may be described as under:

(i) Regarding single-volume references
• Author's name in normal order (not beginning with the last name as in a bibliography), followed by a comma;
• Title of the work, underlined to indicate italics;
• Place and date of publication;
• Pagination reference (the page number).
Example: John Gassner, Masters of the Drama, New York: Dover Publications, Inc., 1954, p. 315.

(ii) Regarding multi-volume references
• Author's name in normal order;
• Title of the work, underlined to indicate italics;
• Place and date of publication;
• Number of the volume;
• Pagination reference (the page number).

(iii) Regarding works arranged alphabetically
For works arranged alphabetically, such as encyclopedias and dictionaries, no pagination reference is usually needed. In such cases the order is illustrated as under:
Example 1: "Salamanca," Encyclopaedia Britannica, 14th Edition.
Example 2: "Mary Wollstonecraft Godwin," Dictionary of National Biography.
But if there should be a detailed reference to a long encyclopedia article, volume and pagination references may be found necessary.

(iv) Regarding periodical references
• Name of the author in normal order;
• Title of the article, in quotation marks;
• Name of the periodical, underlined to indicate italics;
• Volume number;
• Date of issuance;
• Pagination reference.

(v) Regarding anthologies and collections
Quotations from anthologies or collections of literary works must be acknowledged not only by author, but also by the name of the collector.

7. Regarding second-hand quotation references
In such cases the documentation should be handled as follows: original author and title; "quoted or cited in"; second author and work.
Example: J.F. Jones, Life in Polynesia, p. 16, quoted in History of the Pacific Ocean Area, by R.B. Abel, p. 191.

8. Case of multiple authorship
If there are more than two authors or editors, then in the documentation only the name of the first is given and the multiple authorship is indicated by "et al." or "and others". Subsequent references to the same work need not be as detailed as stated above. If the work is cited again without any other work intervening, it may be indicated as ibid., followed by a comma and the page number. A single page should be referred to as p., but more than one page as pp. If several pages are referred to at a stretch, the practice is often to use the first page number with "ff", for example pp. 190ff., which means page 190 and the following pages; "190f." means only page 190 and the following page. Roman numerals are generally used to indicate the number of the volume of a book. Op. cit. (opere citato, in the work cited) and Loc. cit. (loco citato, in the place cited) are two very convenient abbreviations used in footnotes. Op. cit. or Loc. cit. after a writer's name suggests that the reference is to a work by that writer which has been cited in detail in an earlier footnote but intervened by some other references.

9. Punctuation and abbreviations in footnotes
The first item after the number in the footnote is the author's name, given in the normal signature order. This is followed by a comma. After the comma, the title of the book is given: the article (such as "A", "An", "The", etc.) is omitted and only the first word and proper nouns and adjectives are capitalized. The title is followed by a comma. Information concerning the edition is given next, and this entry is followed by a comma.
The place of publication is then stated; it may be mentioned in abbreviated form if the place happens to be a famous one, such as Lond. for London, N.Y. for New York, N.D. for New Delhi, and so on. This entry is followed by a comma. Then the name of the publisher is mentioned, and this entry is closed by a comma. It is followed by the date of publication, if the date is given on the title page. If the date appears in the copyright notice on the reverse side of the title page or elsewhere in the volume, the comma should be omitted and the date enclosed in square brackets, [c 1978] or [1978]. This entry is followed by a comma. Then follow the volume and page references, separated by a comma if both are given. A period closes the complete documentary reference. But one should remember that the documentation regarding acknowledgements from magazine articles and periodical literature follows a different form, as stated earlier while explaining the entries in the bibliography.

10. Use of statistics, charts and graphs
A judicious use of statistics in research reports is often considered a virtue, for it contributes a great deal towards the clarification and simplification of the material and research results. One may well remember that a good picture is often worth more than a thousand words. Statistics are usually presented in the form of tables, charts, bar and line graphs, and pictograms. Such presentation should be self-explanatory and complete in itself, suitable and appropriate to the problem at hand, and neat and attractive (a minimal charting sketch follows this list).

11. The final draft
Revising and rewriting the rough draft of the report should be done with great care before writing the final draft. For this purpose, the researcher should put questions to himself like: Are the sentences written in the report clear? Are they grammatically correct? Do they say what is meant? Do the various points incorporated in the report fit together logically? Having at least one colleague read the report just before the final revision is extremely helpful. Sentences that seem crystal clear to the writer may prove quite confusing to other people; a connection that had seemed self-evident may strike others as a non sequitur. A friendly critic, by pointing out passages that seem unclear or illogical, and perhaps suggesting ways of remedying the difficulties, can be an invaluable aid in achieving the goal of adequate communication.

12. Bibliography
A bibliography should be prepared and appended to the research report, as discussed earlier.

13. Preparation of the index
At the end of the report, an index should invariably be given, the value of which lies in the fact that it acts as a good guide to the reader. An index may be prepared both as a subject index and as an author index. The former gives the names of the subject topics or concepts along with the numbers of the pages on which they appear or are discussed in the report, whereas the latter gives similar information for the names of authors. The index should always be arranged alphabetically. Some people prefer to prepare a single index common to names of authors, subject topics, concepts and the like.
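As promised under item 10 above, a minimal charting sketch, assuming matplotlib is available; the survey counts are invented for illustration.

import matplotlib.pyplot as plt

# Hypothetical survey result (counts invented for illustration).
categories = ["Agree", "Neutral", "Disagree"]
counts = [42, 27, 31]

plt.bar(categories, counts)
plt.title("Responses to survey item 1 (n = 100)")  # a self-explanatory title
plt.xlabel("Response")
plt.ylabel("Number of respondents")
plt.savefig("responses.png")  # save the figure for inclusion in the report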
Report

A report is the formal writing up of a project or a research investigation. A report has clearly defined sections presented in a standard format, which are used to tell the reader what you did, why and how you did it, and what you found. Reports differ from essays because they require an objective writing style which conveys information clearly and concisely.

Structuring Your Report

Most reports include the following sections:
• Title
• Abstract
• Introduction
• Method
• Results
• Discussion
• Conclusions
• References
• Appendices

1. Title
• This should be short and precise. It should tell the reader the nature of your research.
• Omit any unnecessary detail, e.g. 'A study of…' is not necessary.

2. Abstract
The Abstract is a self-contained summary of the whole of your report. It will therefore be written last and is usually limited to one paragraph. It should contain:
• An outline of what you investigated (as stated in your title)
• Why you chose to look at that particular area, with brief reference to prior research done in the field
• Your hypothesis (a prediction of what the results will show)
• A brief summary of your method
• Your main findings and how these relate to your hypothesis
• A conclusion, which may include a suggestion for further research

3. Introduction
The Introduction 'sets the scene' for your report; it does this in two ways:
• By introducing the reader in more detail to the subject area you are looking at
• By presenting your objectives and hypotheses
Explain the background to the problem with reference to previous work conducted in the area (i.e. a literature review). Only include studies that have direct relevance to your research. Briefly discuss the findings of other researchers and how these connect with your study. Finally, state your aims or hypothesis.

4. Method
The Method section should describe every step of how you carried out your research in sufficient detail for the reader to understand what you did. Information on your experimental design, sampling methods, participants and the overall procedure employed should be clearly specified. This information is usually presented under the following sub-headings:
• Objective
• Design
• Participants
• Procedure(s)

5. Results
Your Results section should clearly convey your findings. These are what you will base your commentary on in the Discussion section, so the reader needs to be certain of what you found.
• Present data in summarized form; raw data belongs in the appendices
• Do not over-complicate the presentation and description of your results; be clear and concise
• Describe what the results were; do not offer interpretations of them
• Present them in a logical order
• Those that link most directly to your hypothesis should be given first

Presenting Data in Tables and Graphs
• Do not present the same data in two or more ways, i.e. use either a table or a graph, or just text.
• Remember that a graph should be understandable independently of any text, but you may accompany each with a description if necessary.
• Use clear and concise titles for each figure. Say which variables the graph or table compares.
• Describe what the graph or table shows, then check that this really is what it shows! If it isn't, you need to amend your figure or your description.

Statistical Analysis
If you conducted a statistical analysis of your results (a minimal reporting sketch follows this list):
• Say which test you used
• Show how your results were analysed, laying out your calculations clearly (ensure you include the level of probability or significance, p or P, and the number of observations made, n)
• Clearly state the results of the analysis, saying whether the result was statistically significant or not, both as numbers and in words
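A minimal sketch of such reporting, assuming scipy is available; the two groups of scores are invented for illustration.

from scipy.stats import ttest_ind

# Hypothetical scores for two independent groups (values invented).
group_a = [12, 15, 14, 10, 13, 16, 11, 14]
group_b = [18, 17, 20, 16, 19, 15, 21, 17]

# Name the test, report the statistic with p and n, then state the verdict in words.
t, p = ttest_ind(group_a, group_b)
n = len(group_a) + len(group_b)
verdict = "statistically significant" if p < 0.05 else "not statistically significant"
print(f"Independent-samples t-test: t = {t:.2f}, p = {p:.4f}, n = {n} ({verdict} at p < .05)")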
6. Discussion
The Discussion section is the most important part of your report. It relates the findings of your study to the research that you talked about in your introduction, thereby placing your work in the wider context. The discussion helps the reader understand the relevance of your research to previous and further work in the field. This is your chance to discuss, analyse and interpret your results in relation to all the information you have collected. The Discussion will probably be the longest section of your report and should contain the following:
• A summary of the main results of your study
• An interpretation of these results in relation to your aims, predictions or hypothesis (e.g. is your hypothesis supported or rejected?) and in relation to the findings of other research in the area
• Consideration of the broader implications of your findings. What do they suggest for future research in the area? If your results contradict previous findings, what does this suggest about your work or the work of others? What should be studied next?
• A discussion of any limitations or problems with your research method or experimental design, and practical suggestions for how these might be avoided if the study were conducted again
• Some carefully considered ideas for further research in the area that would help clarify or take forward your own findings

7. Conclusions
The Conclusions section briefly summarises the main issues arising from your report.

8. References
• Give details of work by all other authors which you have referred to in your report.
• Check a style handbook or journal articles for variations in referencing styles.

9. Appendices
The Appendices contain material that is relevant to your report but would disrupt its flow if it were contained within the main body, for example: raw data and calculations; interview questions; a glossary of terms; or other information that the reader may find useful to refer to. All appendices should be clearly labelled and referred to where appropriate in the main text (e.g. 'See Appendix A for an example questionnaire').

Types of Report

Type 1. Formal or Informal Reports
Formal reports are carefully structured; they stress objectivity and organization, contain much detail, and are written in a style that tends to eliminate elements such as personal pronouns. Informal reports are usually short messages with natural, casual use of language. The internal memorandum can generally be described as an informal report.

Type 2. Short or Long Reports
This is a confusing classification. A one-page memorandum is obviously short, and a twenty-page report is clearly long. But where is the dividing line? Bear in mind that as a report becomes longer (or what you determine as long), it takes on more characteristics of formal reports.

Type 3. Informational or Analytical Reports
Informational reports (annual reports, monthly financial reports, and reports on personnel absenteeism) carry objective information from one area of an organization to another. Analytical reports (scientific research, feasibility reports, and real-estate appraisals) present attempts to solve problems.

Type 4. Proposal Report
The proposal is a variation of problem-solving reports. A proposal is a document prepared to describe how one organization can meet the needs of another. Most governmental agencies advertise their needs by issuing "requests for proposals", or RFPs. The RFP specifies a need, and potential suppliers prepare proposal reports telling how they can meet that need.

Type 5. Vertical or Lateral Reports
This classification refers to the direction a report travels.
Reports that move upward or downward in the hierarchy are referred to as vertical reports; such reports contribute to management control. Lateral reports, on the other hand, assist in coordination within the organization. A report traveling between units at the same organizational level (say, the production and finance departments) is lateral.

Type 6. Internal or External Reports
Internal reports travel within the organization. External reports, such as the annual reports of companies, are prepared for distribution outside the organization.

Type 7. Periodic Reports
Periodic reports are issued on regularly scheduled dates. They are generally directed upward and serve management control. Preprinted forms and computer-generated data contribute to the uniformity of periodic reports.

Type 8. Functional Reports
This classification includes accounting reports, marketing reports, financial reports, and a variety of other reports that take their designation from the ultimate use of the report. Almost all reports could be included in most of these categories, and a single report could be included in several classifications. Although authorities have not agreed on a universal report classification, these report categories are in common use and provide a nomenclature for the study (and use) of reports.

Reports are also classified on the basis of their format. As you read the classification structure described below, bear in mind that it overlaps with the classification pattern described above.

(i) Preprinted form
Basically for "fill in the blank" reports. Most are relatively short (five or fewer pages) and deal with routine, mainly numerical, information. Use this format when it is requested by the person authorizing the report.

(ii) Letter
Common for reports of five or fewer pages that are directed to outsiders. These reports include all the normal parts of a letter, but they may also have headings, footnotes, tables and figures. Personal pronouns are used in this type of report.

(iii) Memo
Common for short (fewer than ten pages) informal reports distributed within an organization. The memo format of "Date", "To", "From" and "Subject" is used. Like longer reports, they often have internal headings and sometimes have visual aids. Memos exceeding ten pages are sometimes referred to as memo reports to distinguish them from shorter ones.

(iv) Manuscript
Common for reports that run from a few pages to several hundred pages and require a formal approach. As their length increases, reports in manuscript format require more elements before and after the text of the report.

Now that we have surveyed the different types of reports and become familiar with the nomenclature, let us move on to the actual process of writing the report.

Report Structure: Preliminaries Section, Main Report

Simple Report Sections
• Introduction, including aims and objectives
• Methodology
• Findings/results
• Discussion
• Conclusions and recommendations
• References

The Sections of a Simple Report

(i) Introduction
State what your research/project/enquiry is about. What are you writing about, why, and for whom? What are your objectives? What are you trying to show or prove (your hypothesis)?

(ii) Methodology
State how you did your research/enquiry and the methods you used. How did you collect your data? For example, if you conducted a survey, say how many people were included and how you selected them. Say whether you used interviews or questionnaires and how you analyzed the data.
(iii) Findings/Results
Give the results of your research. Do not, at this stage, try to interpret the results – simply report them. This section may include graphs, charts, diagrams, etc. (clearly labelled). Be very careful about copyright if you are using published charts, tables, illustrations, etc.

(iv) Discussion
Interpret your findings. What do they show? Were they what you expected? Could your research have been done in a better way?

(v) Conclusions and Recommendations
These should follow on logically from the Findings and Discussion sections. Summarise the key points of your findings and show whether they prove or disprove your hypothesis. If you have been asked to, you can make recommendations arising from your research.

(vi) References
List all your sources in alphabetical order, using the appropriate University of Hull style. You might find our referencing pages useful.

Preliminaries
• Title page
• Terms of reference, including scope of the report
• Contents
• List of tables and diagrams
• Acknowledgements, i.e. thanks to those who helped with the report
• Summary, i.e. key points of the report

Main part
• Introduction
• Methodology
• Findings/results
• Discussion
• Conclusions and recommendations

Supplementary
• References/bibliography
• Appendices
• Glossary

Interpretation of Results, Suggestion and Recommendations

Many of the targeted recommendations amount to requesting that problem and planner developers be more precise about the requirements for, and expectations of, their contributions. Because planners are extremely complex and time-consuming to build, the documentation may be inadequate to determine how a subsequent version differs from the previous one, or under what conditions (e.g., parameter settings, problem types) the planner can be fairly compared. With the current positive trend in making planners available, it behooves the developer to include such information in the distribution of the system.

The most sweeping recommendation is to shift the research focus away from developing the best general-purpose planner. Even in the competitions, some of the planners identified as superior have been ones designed for specific classes of problems, e.g., FF and IPP. The competitions have done a great job of exciting interest and encouraging the development and public availability of planners that incorporate the same representation. However, to advance the research, the most informative comparative evaluations are those designed for a specific purpose – to test some hypothesis or prediction about the performance of a planner. An experimental hypothesis focuses the analysis and often leads naturally to justified design decisions about the experiment itself. For example, Hoffmann and Nebel, the authors of the Fast-Forward (FF) system, state in the introduction to their JAIR paper that FF's development was motivated by a specific set of the benchmark domains; because the system is heuristic, they designed the heuristics to fit the expectations and needs of those domains [Hoffmann & Nebel 2001]. Additionally, in part of their evaluation, they compare against a specific system with which their own system had commonalities and point out the various advantages or disadvantages of their design decisions on specific problems. Follow-up work, or researchers comparing their own systems to FF, now have a well-defined starting point for any comparison.

Recommendation 1: Experiments should be driven by hypotheses.
Researchers should precisely articulate, in advance of the experiments, their expectations about how their new planner or augmentations to an existing planner add to the state of the art. These expectations should in turn justify the selection of problems, other planners and metrics that form the core of the comparative evaluation.

A general issue is whether the results are accurate. We reported the results as they were output by the planners. If a planner stated in its output that it had been successful, we took it at face value. However, by examining some of the output, we determined that some claims of successful solution were erroneous – the proposed solution would not work. The only way to ensure that the output is correct is with a solution checker. Drew McDermott used a solution checker in the AIPS98 competition; however, the planners do not all provide output in a format compatible with his checker. Thus, another concern with any comparative evaluation is that the output needs to be cross-checked. Because we are not declaring a winner (i.e., that some planner exhibited superior performance), we do not think that the lack of a solution checker casts serious doubt on our results. For the most part, we have only been concerned with factors that cause the observed success rates to change.

Recommendation 2: Just as input has been standardized with PDDL, output should be standardized, at least in the format of returned plans.

Another general issue is whether the benchmark sets are representative of the space of interesting planning problems. We did not test this directly (in fact, we are not sure how one could do so), but the clustering of results and observations by others in the planning community suggest that the set is biased toward logistics problems. Additionally, many of the problems are getting dated and no longer distinguish performance. Some researchers have begun to analyze the problem set more formally, either in the service of building improved planners (e.g., [Hoffmann & Nebel 2001]) or to better understand planning problems. For example, in the related area of scheduling, our group has identified distinctive patterns in the topology of search spaces for different types of classical scheduling problems and has related the topology to the performance of algorithms [Watson et al. 2001]. Within planning, Hoffmann has examined the topology of local search spaces in some of the small problems in the benchmark collection and found a simple structure with respect to some well-known relaxations [Hoffmann 2001]. Additionally, he has worked out a partial taxonomy, based on three characteristics, for the analyzed domains. Helmert has analyzed the computational complexity of a subclass of the benchmarks, transportation problems, and has identified key features that affect the difficulty of such problems [Helmert 2001].

Recommendation 3: The benchmark problem sets should themselves be evaluated and overhauled. Problems that can be easily solved should be removed. Researchers should study the benchmark problems/domains to classify them into problem types and key characteristics. Developers should contribute application problems and realistic versions of them to the evolving set.
