OpenAI recently introduced CriticGPT, a new AI model designed to critique ChatGPT's responses. The tool assists human trainers in evaluating outputs during reinforcement learning from human feedback (RLHF). While not flawless, OpenAI says CriticGPT improves trainers' ability to identify issues.

However, the question arises: Is it wise to integrate more AI into the evaluation process? In the latest episode of our podcast, we discussed this topic with Rob Whiteley, CEO of Coder.

Here is an edited and abridged version of that conversation:

With so many people engaging with ChatGPT, issues like hallucinations and copyright violations have come to light. OpenAI has now decided to have one imperfect AI reviewed by another AI, which is expected to be more reliable. Does this concern you?

I think the initial answer leans towards yes. If pressed for a straightforward response, the approach seems overly ambitious. The nuance lies in how comfortable you are adjusting the AI's parameters. Essentially, if an AI consistently yields incorrect outcomes and is then tasked with verifying itself, a key human oversight step is bypassed. My conversations with customers suggest they follow roughly an 80/20 split: 80% of tasks can be handled with AI or generative AI tools, but the remaining 20% still demands human involvement.

It's concerning to consider delegating that final 20% to the AI for self-review; that could lead us into risky territory. Yet experience with these tools shows they are only as effective as the instructions they receive. If you precisely define what the AI should and should not assess (identifying coding mistakes, logical errors, or bugs, for example), and expressly instruct it not to guess or mislead but to ask for guidance when something is unclear, you significantly improve its performance by setting clear expectations.
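The scoping idea described above can be sketched in code. This is a hypothetical illustration, not OpenAI's actual CriticGPT setup: the prompt wording and the `build_review_request` helper are assumptions, showing one way to constrain an AI reviewer to a narrow task and an explicit "ask, don't guess" rule.

```python
# Hypothetical sketch: building a tightly scoped "AI critic" prompt.
# The prompt text and helper are illustrative assumptions, not any
# vendor's real configuration.

CRITIC_SYSTEM_PROMPT = """\
You are a code reviewer. Your task is limited to:
1. Identifying coding mistakes (syntax errors, API misuse).
2. Flagging logical errors and likely bugs.

Rules:
- Do NOT guess. If the code's intent is unclear, ask a clarifying
  question instead of assuming.
- Do NOT comment on style, naming, or architecture.
- Cite the exact line or expression for every issue you raise.
"""

def build_review_request(code: str) -> list[dict]:
    """Package the scoped prompt and the code under review as a
    chat-style message list, ready to send to an LLM API."""
    return [
        {"role": "system", "content": CRITIC_SYSTEM_PROMPT},
        {"role": "user", "content": f"Review this code:\n\n{code}"},
    ]
```

The point is not the specific wording but the structure: the reviewer's scope is enumerated, and the failure mode the interview warns about (the model assuming or misleading when unclear) is explicitly prohibited up front.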

The real challenge is whether you can still shape the AI’s function or if it has become a semi-autonomous entity operating in the background. This determines the extent of direct influence you retain over the technology.

So, how much of the rush into AI integration do you believe is simply people adopting the technology too quickly?

We are currently experiencing a significant hype bubble in technology, particularly around enabling developers to use tools like Copilot or other GenAI technologies. The initial adoption often appears successful, marked by a spike in usage. However, the critical issue is sustainability—whether usage continues in the subsequent weeks, if it’s consistent, and whether it’s genuinely benefiting the users or improving outcomes like reducing bugs or speeding up build times.
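The sustainability test described above can be made concrete with a simple metric. This is an illustrative sketch, not from the interview: the sample usage numbers and the 50% threshold are assumptions, showing one way to distinguish an initial spike from durable adoption.

```python
# Illustrative sketch (assumed data): checking whether a GenAI tool's
# usage is sustained after rollout, rather than an initial spike.

weekly_active_users = [120, 95, 60, 42, 38, 35]  # users per week since launch

def retention_vs_launch(weekly: list[int]) -> list[float]:
    """Fraction of launch-week users still active in each later week."""
    launch = weekly[0]
    return [round(week / launch, 2) for week in weekly]

rates = retention_vs_launch(weekly_active_users)
# rates[0] is 1.0 by construction; a steep decline afterwards suggests
# hype rather than genuine benefit.
sustained = rates[-1] >= 0.5  # assumed threshold for "sticking"
```

Pairing a retention curve like this with outcome metrics the interview mentions (bug counts, build times) separates "developers tried it" from "developers kept it because it helped."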

This rush to adopt new technologies feels reminiscent of the early days of cloud technology, perceived then as a universal solution. As companies dove in, the hidden costs and potential issues like latency became apparent, yet by then, many felt too committed to backtrack. This “ready, fire, aim” approach to adopting technology can be risky without a well-considered strategy.

While I’m not entirely skeptical of GenAI, I recognize its potential value and possible productivity enhancements. However, I believe it’s crucial for companies to approach technology adoption with a clear business case, conduct rigorous testing, and roll out based on proven results rather than opening the floodgates on mere hopes.

Concerning developers’ perspectives on AI, do they view it as an advantageous tool poised to enhance their work, or do they fear it might threaten their employment? What’s the general consensus?

Coder, being a software company, obviously employs numerous developers. We conducted an internal survey and found that 60% of our developers actively use and appreciate GenAI coding tools. However, about 20% tried them and subsequently abandoned them, while the remaining 20% have not yet tried them. Considering how new this technology is, that penetration is fairly substantial.

Personally, I see value and adoption, but the 20% who discontinued use concern me. What led them to stop? Was it mistrust, user experience issues, or a poor fit with their development workflow? Achieving an 80% satisfaction rate, understanding that 100% is unrealistic, would be significant. It would mean this technology has fundamentally altered our coding practices. I believe we are on the verge of reaching that milestone quite rapidly, though we are not quite there yet.

