College professors are going back to paper exams and handwritten essays to fight students using ChatGPT
The growing number of students using the AI program ChatGPT as a shortcut in their coursework has led some college professors to reconsider their lesson plans for the upcoming fall semester.
I think that’s actually a good idea? Sucks for e-learning as a whole, but I always found online exams (and also online interviews) to be very easy to game.
Prof here - take a look at it from our side.
Our job is to evaluate YOUR ability; and AI is a great way to mask poor ability. We have no way to determine if you did the work, or if an AI did, and if called into a court to certify your expertise we could not do so beyond a reasonable doubt.
I am not arguing exams are perfect, mind, but I’d rather doubt a few students’ inability (maybe it was just a bad exam for them) than always doubt their ability (is any of this their own work?).
Case in point, ALL students on my course with low (<60%) attendance this year scored 70s and 80s on the coursework and 10s and 20s in the OPEN BOOK exam. I doubt those 70s and 80s are real reflections of the ability of the students, but they do suggest the students can obfuscate AI work well.
I recently finished my degree, and exam-heavy courses were the bane of my existence. I could sit down with the homework, work out every problem completely with everything documented, and then sit down to an exam and suddenly it’s “what’s a fluid? What’s energy? Is this a pencil?”
The worst example was a course with three exams worth 30% of the grade, attendance 5% and homework 5%. I had to take the course twice; 100% on HW each time, but barely scraped by with a 70.4% after exams on the second attempt. Courses like that took years off my life in stress. :(
If you don’t mind me asking - what kind of degree was it, and what format were the exams?
Sure; it was Mechanical Engineering. The class was “Vibrations & Controls;” the first half of the course was vibrations / oscillatory systems, and then the second half was theory of feedback & control systems (classic “PID” controllers for the most part). The exams were pencil-and-paper, in-person, time-limited.
The first attempt we were allowed nothing except the exam and paper for answers; honestly I’m not sure what that professor was expecting.
In my second attempt the professor provided a formula sheet, but he was of the mindset of “If you know F=ma, you can derive anything you need!” so the formula sheets were sparse to put it mildly. It was just enough to keep me from fully collapsing in panic and bombing, but it was close.
Thanks for the info!
If you’d been able to take 4 sides (A4) of written notes in, would this have helped mitigate the stress?
What do you feel would have been a better method of assessment?
Being able to bring my own formula sheet (or notes) definitely helped. Two full pages of notes would be great, though I would still get some bad nerves even in those cases (the very idea that the next 60 minutes of class time decides a full 30% of the course grade just rattled me bad).
For me the ideal type of course would be the Thermodynamics of Mechanical Systems course I took. The exams were in-person but open-note and straightforward, with relatively simple conceptual questions. Credit was split between the exams and bi-monthly “mini projects.” These would ask you to apply the class concepts to some larger set of related problems; parameters were provided and you would have to determine the answers using what was learned in class. For example, one project was to design a steam turbine power plant with a target output of 50 MW, an ambient temperature of 30 °C, and cooling water available at 25 °C: determine the heat input needed from the boiler, choose an appropriate number of turbine stages with reheat if possible, size the condenser appropriately and add economizers if they can be used. You’d lay it all out and indicate the temperatures, pressures, power inputs and outputs, exergy of the system, etc.
I did stellar in that class. I would have loved that format everywhere (simple concept exams + application projects).
The true engineering experience is exams that ask you to derive God after your homework was just 2+2. I remember hearing a rumor once that the exams were to find students who would be good to help with the professor’s research.
Now that I’m on the other side of the degree with a couple of years behind me, I do think those tests were the crucible that turned us into engineers. Working through daunting, impossible questions under stress is how we developed our problem-solving capability.
I do think, though, that there are vast improvements still to be made. It’s highlighted in just how many of us have anxiety and depression and become nervous wrecks. Make sure to take care of yourself and see professionals to help with that, if you need it.
Graduated a year ago, just before this AI craze was a thing.
I feel there’s a social shift when it comes to education these days. It’s mostly: “do a 500-1,000 word essay to get 1.5% of your grade”. The education doesn’t matter anymore, the grades do; if you pick something up along the way, great! But it isn’t that much of a priority.
I think it partially comes from colleges squeezing students of their funds, and indifferent professors who just assign busywork for the sake of it. There are a lot of uncaring professors that just throw tons of work at students, turning them back to the textbook whenever they ask questions.
However, I don’t doubt a good chunk of students use AI on their work to just get it out of the way. That really sucks and I feel bad for the professors that actually care and put effort into their classes. But, I also feel the majority does it in response to the monotonous grind that a lot of other professors give them.
Here’s a somewhat tangential counter, which I think some of the other replies are trying to touch on … why, exactly, continue valuing our ability to do something a computer can so easily do for us (to some extent obviously)?
In a world where something like AI can come up and change the landscape in a matter of a year or two … how much value is left in the idea of assessing people’s value through exams (and to be clear, I’m saying this as someone who’s done very well in exams in the past)?
This isn’t to say that knowing things is bad or making sure people meet standards is bad etc. But rather, to question whether exams are fit for purpose as a means of measuring what matters in a world where what’s relevant, valuable or even accurate can change pretty quickly compared to the timelines of one’s life or education. Not long ago we were told that we won’t have calculators with us everywhere, and now we could have calculators embedded in our ears if we wanted to. Analogously, learning and examination is probably being premised on the notion that we won’t be able to look things up all the time … when, as current AI, amongst other things, suggests, that won’t be true either.
An exam assessment structure naturally leans toward memorisation and being drilled in a relatively narrow band of problem solving techniques,[1] which are, IME, often crammed prior to the exam and often forgotten quite severely pretty soon afterward. So even presuming that things that students know during the exam are valuable, it is questionable whether the measurement of value provided by the exam is actually valuable. And once the value of that information is brought into question … you have to ask … what are we doing here?
Which isn’t to say that there’s no value created in doing coursework and cramming for exams. Instead, given that a computer can now so easily augment our ability to do this assessment, you have to ask what education is for and whether it can become something better than what it is given what are supposed to be the generally lofty goals of education.
In reality, I suspect (as many others do) that the core value of the assessment system is to simply provide a filter. It’s not so much what you’re being assessed on as much as your ability to pass the assessment that matters, in order to filter for a base level of ability for whatever professional activity the degree will lead to. Maybe there are better ways of doing this that aren’t so masked by other somewhat disingenuous goals?
Beyond that there’s a raft of things the education system could emphasise more than exam based assessment. Long form problem solving and learning. Understanding things or concepts as deeply as possible and creatively exploring the problem space and its applications. Actually learn the actual scientific method in practice. Core and deep concepts, both in theory and application, rather than specific facts. Breadth over depth, in general. Actual civics and knowledge required to be a functioning member of the electorate.
All of which are hard to assess, of course, which is really the main point of pushing back against your comment … maybe we’re approaching the point where the cost-benefit equation for practicable assessment is being tipped.
[1] In my experience, the best means of preparing for exams, as is universally advised, is to take previous or practice exams … which I think tells you pretty clearly what kind of task an exam actually is … a practiced routine in something that narrowly ranges between regurgitation and pretty short-form, practiced and shallow problem solving.
Ah the calculator fallacy; hello my old friend.
So, a calculator is a great shortcut, but it’s useless for most mathematics (i.e. proof!). A lot of people assume that having a calculator means they do not need to learn mathematics - a lot of people are dead wrong!
In terms of exams being about memory, I run mine open book (i.e. students can take pre-prepped notes in). Did you know, some students still cram and forget right after the exams? Did you know, they forget even faster for coursework?
Your argument is a good one, but let’s take it further - let’s rebuild education towards an employer centric training system, focusing on the use of digital tools alone. It works well, productivity skyrockets, for a few years, but the humanities die out, pure mathematics (which helped create AI) dies off, so does theoretical physics/chemistry/biology. Suddenly, innovation slows down, and you end up with stagnation.
Rather than moving us forward, such a system would lock us into place and likely create out of date workers.
At the end of the day, AI is a great tool, but so is a hammer and (like AI today), it was a good tool for solving many of the problems of its time. However, I wouldn’t want to only learn how to use a hammer, otherwise how would I be replying to you right now?!?
So … I honestly think this is a problematic reply … I think you’re being defensive (and consequently maybe illogical), and, honestly, that would be the red flag I’d look for to indicate that there’s something rotten in academia. Otherwise, there might be a bit of a disconnect here … thoughts:
- The calculator was in reference to arithmetic and other basic operations and calculations using them … not higher level (or actual) mathematics. I think that was pretty clear and I don’t think there’s any “fallacy” here, like at all.
- The value of learning (actual) mathematics is pretty obvious I’d say … and was pretty much stated in my post about alternatives to emphasise. On which, getting back to my essential point … how would one best learn and be assessed on their ability to construct proofs in mathematics? Are timed open book exams (and studying in preparation for them) really the best we’ve got!?
- Still forgetting with open book exams … seems like an obvious outcome as the in-exam materials de-emphasise memory … they probably never knew the things you claim they forget in the first place. Why? Because the exam only requires the students to be able to regurgitate in the exam, which is the essential problem, and for which in-exam materials are a perfect assistant. Really not sure what the relevance of this point is.
- Forgetting after coursework … how do you know this (genuinely curious)? Even so, coursework isn’t the great opposite to exams. Under the time crunch of university, it is also often crammed, just not in an examination hall. The alternative forms of education/assessment I’m talking about are much more long-form and exploration and depth focused. The most I’ve ever remembered from a single semester subject came from when I was allowed to pursue a single project for the whole subject. Also, I didn’t mention ordinary general coursework in my post, as, again, it’s pretty much the same paradigm of education as exams, just done at home for the most part.
- Rebuilding education toward employer centric training system … I … ummm … never suggested this … I suggested the opposite … only things that were far more “academic” than this and were never geared toward “productivity”. This is a pretty bad straw man argument for a professor to be making, especially given that it seems constructed to conclude that the academy and higher learning are essential for the future success of the economy (which I don’t disagree with or even question in my post).
- You speak about AI a lot. Maybe your whole reply was solely to the whole calculator point I made. This, I think, misses the broader point, which most of my post was dedicated to. That is, this isn’t about us now needing to use AI in education (I personally don’t buy that at all for probably much of the same reason you’d push back on it). Instead, it’s about what it means about our education system that AI can kinda do the thing we’re using to assess ourselves … on which I say, it tells us that the value of the assessment system we take pretty seriously ought to be questioned, especially, as I think we both agree on, given the many incredibly valuable things education has to offer the individual and society at large. In my case, I go further and say that the assessment system is detracting, and has already detracted, from these potential offerings, and that it does not bode well for modern western society that it seems to be leaning into the assessment system rather than expanding its scope.
OK Mr Socrates how else would you assess whether a student has learned something?
Ha … well if I had answers I probably wouldn’t be here! But seriously, I do think this is a tough topic with lots of tangled threads linked to how our society functions. I’m not sure there are any easy “fixes”, I don’t think anyone who thinks that can really be trusted, and it may very well turn out that I’m completely wrong and there is no “better way”, as something flawed and problematic may just turn out to be what humanity needs.
A pretty minor example based on the whole thing of returning to paper exams. What happens when you start forcing students to be judged on their ability to do something, alone, where they know very well that they can do better with an AI assistant? Like at a psychological and cultural level? I don’t know, I’m not sure my generation (Xennial) or earlier ever had that. Even with calculators and arithmetic, it was always about laziness, or dealing with big numbers that were impossible for (normal) humans, or ensuring accuracy. It may not be the case that AI is at that level yet for many exams and students (I really don’t know), but it might be or might be soon. However valuable it is to force students to learn to do the task without the AI, there’s gotta be some broad cultural effect in just ignoring the super useful machine.
Otherwise, my general ideas would be to emphasise longer form work (which AI is not terribly useful for). Work that requires creativity, thinking, planning, coherent understanding, human-to-human communication and collaboration. So research projects, actual practical work, debates, teaching as a form of assessment etc. In many ways, the idea of “having learned something” becomes just a baseline expectation. Exams, for instance, may still hold lots of value, but not as forms of objective assessment, but as a way of calibrating where you’re up to on the basic requirements to start the real “assessment” and what you still need to work on.
Also … “OK Mr Socrates” … is maybe not the most polite way of engaging here … comes off as somewhat aggressive TBH.
In my experience, they love to give exams where it doesn’t matter what notes you bring, you’re on the same level whether you write down only the essential equations, or you copy down the whole textbook.
As they are talking about writing essays, I would argue the importance of being able to do it lies in being able to analyze a book/article/whatever, make an argument, and defend it. Being able to read and think critically about the subject would also be very important.
Sure, rote memorization isn’t great, but neither is having to look something up every single time you ever need it because you forgot. There are also many industries in which people do need a large information base at close recall. Learning to do that much later in life sounds very difficult. I’m not saying people should memorize everything, but not having very many facts about the world around you at basic recall doesn’t sound good either.
“Learning to do that much later in life sounds very difficult”
That’s an interesting point I probably take for granted.
Nonetheless, exercising memory is probably something that could be done in a more direct fashion, and therefore probably better, without that concern affecting the way we approach all other forms of education.
Student here - How does that cursive longhand thing go again?
“Avoid at all costs because we hate marking it even more than you hate writing it”?
An in person exam can be done in a locked down IT lab, and this leads to a better marking experience, and I suspect a better exam experience!
Is AI going to go away?
In the real world, will those students be working from a textbook, or from a browser with some form of AI accessible in a few years?
What exactly is being measured and evaluated? Or has the world changed, and existing infrastructure is struggling to cling to the status quo?
Were those years of students being forced to learn cursive in the age of the computer a useful application of their time? Or math classes where a calculator wasn’t allowed?
I can hardly think just how useful a programming class where you need to write it on a blank page of paper with a pen and no linters might be, then.
Maybe the focus on where and how knowledge is applied needs to be revisited in light of a changing landscape.
For example, how much more practically useful might test questions be that provide a hallucinated wrong answer from ChatGPT and then task the students to identify what was wrong? Or provide them a cross discipline question that expects ChatGPT usage yet would remain challenging because of the scope or nuance?
I get that it’s difficult to adjust to something that’s changed everything in the field within months.
But it’s quite likely a fair bit of how education has been done for the past 20 years in the digital age (itself a gradual transition to the Internet existing) needs major reworking to adapt to changes rather than simply oppose them, putting academia in a bubble further and further detached from real world feasibility.
If you’re going to take a class to learn how to do X, but never actually learn how to do X because you’re letting a machine do all the work, why even take the class?
In the real world, even if you’re using all the newest, cutting edge stuff, you still need to understand the concepts behind what you’re doing. You still have to know what to put into the tool and that what you get out is something that works.
If the tool, AI, whatever, is smart enough to accomplish the task without you actually knowing anything, what the hell are you useful for?
But that’s actually most of the work we have nowadays. AI is replacing repetitive work such as magazine writing or script writing.
Writers are repetitive work???
Well, it seems they will be replaced, at least certain writers. https://www.npr.org/2023/05/20/1177366800/striking-movie-and-tv-writers-worry-that-they-will-be-replaced-by-ai Also, call centers https://www.bbc.com/news/business-65906521 And junior programmers. This isn’t just my opinion; those things have already happened, so it’s not debatable.
I understand that they’ll be replaced, or at least the producers want that, but I don’t think that’s because the work is repetitive, more that they need a lot of them.
As an anecdote, though, I once saw someone simply forwarding (i.e. copying and pasting) their exam questions to ChatGPT. His answers were just ChatGPT responses, but paraphrased to make them look less GPT-ish. I am not even sure whether he understood the questions themselves.
In this case, the only skill that is tested … is English paraphrasing.
I’ll field this because it does raise some good points:
It all boils down to how much you trust what is essentially matrix multiplication, trained on the internet, with some very arbitrarily chosen initial conditions. Early on when AI started cropping up in the news, I tested the validity of answers given:
1. For topics aimed at 10–18 year olds, it does pretty well. Its answers are generic, and it makes mistakes every now and then.
2. For 1st–3rd year degree, it really starts to make dangerous errors, but it’s a good tool to summarise materials from textbooks.
3. Masters+, it spews (very convincing) bollocks most of the time.
Recognising the mistakes in (1) requires checking it against the course notes, something most students manage. Recognising the mistakes in (2) is often something a stronger student can manage, but not a weaker one. As for (3), you are going to need to be an expert to recognise the mistakes (it literally misinterpreted my own work at me at one point).
The irony is, education in its current format is already working with AI, it’s teaching people how to correct the errors given. Theming assessment around an AI is a great idea, until you have to create one (the very fact it is moving fast means that everything you teach about it ends up out of date by the time a student needs it for work).
However, I do agree that education as a whole needs overhauling. How to do this: maybe fund it a bit better so we’re able to hire folks to help develop better courses. At the moment every “great course” you’ve ever taken was paid for in blood (i.e. 50 hour weeks teaching/marking/prepping/meeting arbitrary research requirements).
(1) seems to be a legitimate problem. (2) is just filtering the stronger students from the weaker ones with extra steps. (3) isn’t an issue unless a professor teaching graduate classes can’t tell BS from truth in their own field. If that’s the case, I’d call the professor’s lack of knowledge a larger issue than the student’s.
You may not know this, but “Masters” is about uncovering knowledge nobody had before, not even the professor. That’s where peer reviews and shit like LK-99 happen.
It really isn’t. You don’t start doing properly original research until a year or two into a PhD. At best a masters project is going to be doing something like taking an existing model and applying it to an adjacent topic to the one it was designed for.
There are places where analog exams went away? I’d say Sweden has always been at the forefront of technology, but our exams were always pen-and-paper.
Covid forced the transition to electronic exams in many areas.
Same in Germany
You can still have AI write the paper and you copy it from text to paper. If anything, this will make AI harder to detect because it’s now AI + human error during the transferring process rather than straight copying and pasting for students.
When I was in College for Computer Programming (about 6 years ago) I had to write all my exams on paper, including code. This isn’t exactly a new development.
Same. All my algorithms and data structures courses in undergrad and grad school had paper exams. I have a mixed view on these but the bottom line is that I’m not convinced they’re any better.
Sure, they might reflect some of the student’s abilities better, but if you’re an evaluator interested in assessing students’ knowledge, a more effective way is to ask directed questions.
What ends up happening a lot of the time is implementation questions that ask too much of the student at once: interpretation of the problem; knowledge of helpful data structures and algorithms; abstract reasoning; edge case analysis; syntax; time and space complexities; and a good sense of planning, since you’re supposed to answer in a few minutes without the luxury and conveniences of a text editor.
This last one is my biggest problem with it. It adds a great deal of difficulty and stress without adding any value to the evaluator.
I had some teachers ask for handwritten programming exams too (that was more like 20 years ago for me) and it was just as dumb then as it is today. What exactly are they preparing students for? No job will ever require the skill of writing code on paper.
Well, if I go back to school now I’m fucked; I can’t read my own handwriting.
As someone with wrist and hand problems that make writing a lot by hand difficult, I’m so lucky I finished college in 2019.
Might as well go back to oral exams and ask the student questions on the spot.
That’s actually something that is done (PhD viva). If I had the budget to hire another 6 assistant profs to viva my 120 students, I’d probably do it for my module too!
I love this method and would use it if it weren’t so incredibly time consuming. How are you supposed to test 30 students that way? Never mind 300.
With AI-based oral testing, of course!
Oh wow … I definitely see someone trying to do that.
Invest now! It’s the future of education.
The best part is there are handwriting-generating programs, or even web pages that convert text to G-code, allowing you to use a 3D printer to write things out. In theory it should be really hard to pass it off as being human written, let alone match your own writing, but I’m sure it will only get better. I think there are even models to try to match someone’s writing.
Cool video if a bit impractical. Also teachers don’t have time to play detective with handwriting comparisons xD
It just brings into question what the point of exams is.
AI in its current form is equivalent to the advent of the typewriter. It’s just empowering you to do a whole lot more a whole lot faster.
Not using it is dumb.
AI is a tool that can indeed be of great benefit when used properly. But using it without comprehending and verifying the source material can be downright dangerous (like those lawyers citing fake cases). The point of the essay/exam is to test comprehension of the material.
Using AI at this point is like using a typewriter in a calligraphy test, or autocorrect in a spelling and grammar test.
Although asking for handwritten essays does nothing to combat use of AI. You can still generate content and then transcribe it by hand.
That argument is great until someone gets maimed or killed because the “AI” got it wrong and the user didn’t know enough to realize.
You know idiots with AI do that all the time everyday right?
My broader point (in your metaphor) is that calligraphy tests are irrelevant at this point. The world changed. There’s no going back.
A calligraphy test is not irrelevant if you are studying to LEARN calligraphy. If you are arguing that calligraphy as a subject doesn’t need to exist, then fine, don’t study it. But you don’t learn it by asking AI to do it for you.
Typewriters are also irrelevant today. It was an analogy. I agree that AI can be used in some evaluations, depending what you’re evaluating.
I allow and encourage Googling for information when I interview software engineering candidates. I don’t consider it “cheating”, on the contrary. Being able to unblock themselves is one of the skills they should have. They will be using external help when doing their job, so why should the test be any different.
But that also reminds me that I actually once had a candidate using generative AI in the coding interview. It did feel like cheating when it was at the level of asking for the full solution, not just help getting unblocked. It didn’t help at all though, because the candidate didn’t even have enough skill to tell the good suggestions from the bad ones or what they needed to iterate on.
If that is the case and comprehending the material isn’t necessary then who needs the students in the first place? Just replace them with AI.
It’ll happen sooner than you think
Until the AI offloads the work onto a lesser AI
If your exams can be solved by AI, then your exams are not good enough. How to get around this? Simple. Oral exams aka viva voce. Anyone who has defended their thesis knows the pants-shitting terror this kind of exam inflicts on you. It will take longer but you can truly determine how well the student understands the content.
Works for PhDs, but try doing oral exams for 1000 Bio101 students.