With a huge volume of developers on its platform, Stack Overflow is well versed in the potential pitfalls of AI coding and reckons it knows how to make fledgling code writers feel more at home.
To discuss these topics and more, DevClass caught up with Stack Overflow CEO Prashanth Chandrasekar in the aftermath of the company’s flowstate event in New York City and online.
According to Chandrasekar, “there are 25 to 28 million developers in the world, and about 100 million technologists, including the developers, data scientists and DevOps engineers. We believe we have about 70 to 80 million technologists… who come to our website, 80% on a weekly basis.”
That gives it influence, and at the event Chandrasekar gave some other statistics: the site has over 100 million monthly visitors, hosts around 52 million questions and answers, and receives a new question every 14 seconds.
The topic of AI coding reared its head again recently when developer Tim Davis, a professor of Computer Science and Engineering at Texas A&M University, said GitHub Copilot “emits large chunks of my copyrighted code, with no attribution, no LGPL license.”
What does Chandrasekar think about AI coding, through which a developer can get help simply by typing a comment in a code editor, without visiting Stack Overflow? “AI generated code is one way to on-ramp people into becoming technologists,” he tells DevClass. “I don’t think it’s the full answer, but it brings a larger population of people into the sphere of writing code.” That said, “at some point you’re going to need to know what you’re building. You may have to debug it and have no idea what was just built, and it’s hard to skip the learning journey by taking shortcuts.”
Does the same issue not arise when a developer copies and pastes code from Stack Overflow without fully understanding it? “There’s an internal phrase that we use, Stack Overflow is the context for the code. Our mission statement is to empower the world to build technology through collective knowledge, it’s not through code necessarily. Code snippets are part of the answer, but the ‘how’ is equally important … a code snippet with nothing else, no context around it, that’s the worst possible situation, you don’t know what you’re doing, but around the code is also an explanation of why … if people don’t read the explanation, that’s on them, right?” Chandrasekar says.
What about the issue with AI coding where now and again it might generate copyrighted code but without the accompanying license? Is that a problem for Stack Overflow as well, if contributors provide code that is taken from an open-source project?
“In licensing terms, it’s Creative Commons License and everybody that participates in Stack knows that,” says Chandrasekar. “That’s also the genesis of Stack Overflow Teams, which is the private version of Stack Overflow, to address that problem. Because companies came to us in 2017 and said, ‘60%, 70% of what I want to collaborate on is private to my company. I’d rather not have people posting on the public forum’.”
What if someone does post copyrighted code? “This is the power of the community,” he says. “It is self-regulating, there are certain rules and standards, and moderators, all for a reason, and those rule sets are governed very heavily.”
It is a fair point that the community around Stack Overflow is not like the black box of AI coding. That said, the strict application of rules on the platform leads naturally into another question. Is it too intimidating, when those seeking help are shot down for not asking questions in quite the right way?
“One of my earliest experiences at Stack Overflow was a negative experience with the public community when I asked a question. I got slapped on the wrist by my fellow community members saying it was a poorly worded question … I went wow, I feel that way, and I’m not a novice programmer … I wonder what the average brand new developer feels?” Chandrasekar tells us.
He says that over the years many things have been tried in order to solve the problem. “Most recently we’ve done two things. One is the Ask Wizard, which is a guide for people to craft high quality questions. Another is called Staging Ground and it’s somewhat similar, a place where you can get feedback from people who are there just to provide good feedback.”
The other side of this coin though is the goal to “make sure that the question quality on the website continues to be high and there’s not a lot of duplicate questions and brain damage on the same stuff over and over again”, Chandrasekar adds.
Put like that, perhaps setting a strict quality bar is no bad thing.